this post was submitted on 19 Nov 2025
32 points (94.4% liked)

I have a vendor that sucks donkey balls. Their systems break often. An endpoint we rely on will start returning [] and take months to fix. They'll change a data label in their backend and not notice that it flows into all of their filters and stuff.

I have some alerts when my consumers break, but I think I'd like something more direct. What's the best way to monitor an external API?

I'm imagining some very basic ML that can pop up and tell me that something has changed, like there are more hosts or categories or whatever than usual, that a structure has gone blank or is missing, that some field has gone to 0 or null across the structure. Heck, that a field name has changed.
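
Something like this rough sketch is what I have in mind (endpoint and field names invented, just to illustrate):

```python
# Rough sketch of the kind of check I'm imagining: flag an empty response,
# and flag any field that has gone blank/0/null across every record.
from collections import Counter
import requests

resp = requests.get("https://vendor.example/api/hosts", timeout=30)
records = resp.json()

if not records:
    print("ALERT: endpoint returned an empty structure")
else:
    blank_counts = Counter()
    for record in records:
        for field, value in record.items():
            if value in (None, 0, "", []):
                blank_counts[field] += 1
    for field, count in blank_counts.items():
        if count == len(records):
            print(f"ALERT: '{field}' is blank in all {count} records")
```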

Is the best way to basically write tests for everything I can think of, and add more as things break, or is there a better tool? I see API monitoring tools but they are for calculating availability for your own APIs, not for enforcing someone else's!

top 36 comments
[–] HaraldvonBlauzahn@feddit.org 1 points 3 hours ago* (last edited 3 hours ago)

It is essentially an attitude and values problem. See Torvalds' email titled "WE DON'T BREAK USER SPACE" and Rich Hickey's talk "Spec-ulation" on YouTube.

Consequently, the fix is to move to another vendor.

[–] villainy@lemmy.world 4 points 12 hours ago (1 children)

There is no sure-fire technical solution. So you name and shame, far and wide, until it affects their bottom line.

[–] clay_pidgin@sh.itjust.works 1 points 12 hours ago

We're both in a really niche market and the other vendors don't seem much better!

[–] MagicShel@lemmy.zip 5 points 13 hours ago (1 children)

This is not a problem that has a technical solution. This requires a business solution—stop doing business with that vendor. Whatever service agreement exists between your companies is either not being enforced or was negotiated by a drunken mule.

[–] clay_pidgin@sh.itjust.works 2 points 13 hours ago

Appreciate the input. You aren't wrong!

[–] Vincent@feddit.nl 17 points 19 hours ago (1 children)

Really depends on your infrastructure, but I'd set up some snapshot tests that call the APIs and compare against known-good responses, run them in a cronjob, and have it alert you if anything fails.
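
Something along these lines, as a minimal sketch; the endpoints and expected keys are placeholders:

```python
# Minimal cron-able check: endpoints and expected keys are placeholders.
# Exit nonzero on failure so cron/your alerter can pick it up.
import sys
import requests

CHECKS = {
    "https://vendor.example/api/hosts": ["hostId", "hostname"],
    "https://vendor.example/api/categories": ["categoryId", "name"],
}

def main() -> int:
    failures = []
    for url, required_keys in CHECKS.items():
        resp = requests.get(url, timeout=30)
        data = resp.json()
        if resp.status_code != 200 or not data:  # catches 200-with-[] too
            failures.append(f"{url}: status={resp.status_code}, empty={not data}")
            continue
        missing = [k for k in required_keys if k not in data[0]]
        if missing:
            failures.append(f"{url}: missing keys {missing}")
    for failure in failures:
        print(failure)  # or POST to a webhook instead
    return 1 if failures else 0

if __name__ == "__main__":
    sys.exit(main())
```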

[–] clay_pidgin@sh.itjust.works 3 points 18 hours ago (2 children)

So far they haven't broken the historical data, so I can't directly compare a response to a known-good one, sadly.

[–] cv_octavio@piefed.ca 9 points 18 hours ago (1 children)

Don't these clowns version their API?

[–] clay_pidgin@sh.itjust.works 3 points 18 hours ago* (last edited 17 hours ago) (1 children)

Not that I've seen! No endpoint tells me anything about the API or its version. Would that be in the response headers, maybe? I'll check, but they're bad at change control, and they use slightly different versions of their systems for each customer, so there's not really a unified version number anyway.

edit: Nothing in the headers.

[–] cv_octavio@piefed.ca 5 points 13 hours ago (1 children)

I mean... we version ours in the URL.

/api/v1/some_endpoint

That way if, for whatever reason, you need to roll a breaking change, you do it in a new version mapped to a new URL.

I'm sorry for what you're going through, I've been there before.

[–] clay_pidgin@sh.itjust.works 6 points 13 hours ago

They've never rolled out a breaking change INTENTIONALLY, which is a fun distinction!

[–] calliope@retrolemmy.com 2 points 18 hours ago (1 children)

You can compare the status to a 500 or a 404 though, to see if it’s running?

When it breaks, you’ll know.

[–] clay_pidgin@sh.itjust.works 3 points 18 hours ago (1 children)

I do that, at least. Most recent problem was one endpoint returning [] instead of a bunch of JSON, still with a 200.

[–] calliope@retrolemmy.com 3 points 17 hours ago (1 children)

Oh duh, I should have known someone would return an empty object/collection or a string or something (“Error”) and 200!

I feel like sometimes monitoring is a bit like whack-a-mole.

[–] clay_pidgin@sh.itjust.works 3 points 17 hours ago

That's my feeling, too.

[–] NABDad@lemmy.world 4 points 15 hours ago

I'm just thrown by you saying you have a vendor that sucks donkey balls. If you only have one, that seems unreal to me.

My group supports around 65 applications, and I'd find it a hell of a lot easier to list the vendors that don't suck donkey balls.

I think there's one. Maybe.

[–] x00z@lemmy.world 2 points 14 hours ago (1 children)

You might be losing more money sticking with this one than you'd spend switching to a more expensive but competent provider.

I've only come across one provider we couldn't replace, and in that case we got them to export their data directly instead of wasting time on their awful API.

[–] clay_pidgin@sh.itjust.works 1 points 14 hours ago

Luckily it's not up to me, but I agree.

I've been complaining about the API for their main custom application, but they also have a ton of data in Salesforce, and they screwed up when they set it up, so it's not multi-tenant or anything. I can't have API access because I'd be able to see and modify every customer's data.

They're awesome.

[–] HubertManne@piefed.social 2 points 16 hours ago

Synthetic monitoring. The big question is how often to run the checks and how many you'll need to cover your use cases.

[–] amlor@piefed.social 4 points 19 hours ago* (last edited 19 hours ago) (1 children)

At my last place of work we just used a small Perl script for this kind of monitoring. It recursively parsed the whole response body and saved which paths exist and what type of data they hold into a db. When something changed, it posted an alert to a webhook. Your case is a bit more complicated, but not by much.
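
The same idea translates to a few lines in pretty much any language. A sketch in Python, with a JSON file standing in for the db and a placeholder webhook URL:

```python
# Walk the response, record (json_path -> type), diff against last run.
# The endpoint, state file, and webhook URL are placeholders.
import json
import pathlib
import requests

STATE = pathlib.Path("api_shape.json")

def walk(node, path="$"):
    """Yield (json_path, type_name) for every value in the response."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from walk(value, f"{path}.{key}")
    elif isinstance(node, list):
        for item in node[:3]:  # sampling a few items is enough for shape
            yield from walk(item, f"{path}[]")
    else:
        yield path, type(node).__name__

current = dict(walk(requests.get("https://vendor.example/api/hosts", timeout=30).json()))
previous = json.loads(STATE.read_text()) if STATE.exists() else current

changed = {p for p in current.keys() | previous.keys() if current.get(p) != previous.get(p)}
if changed:
    requests.post("https://hooks.example/alert", json={"changed_paths": sorted(changed)})
STATE.write_text(json.dumps(current))
```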

[–] clay_pidgin@sh.itjust.works 1 points 18 hours ago

I'm not sure what you mean on the first part. I've read that you should be able to sort of walk through a RESTful API via references to other "tables", but this API doesn't work like that. There's no endpoint that lists endpoints.

All of the responses are dozens to hundreds of lines of JSON, often with a few of the fields being present or absent depending on the entry.

[–] yaroto98@lemmy.world 3 points 18 hours ago (1 children)

Do they use OpenAPI or Swagger or something? If so, you should be able to do something like point changedetection.io at their Swagger docs page.

[–] clay_pidgin@sh.itjust.works 3 points 18 hours ago (2 children)

They generate a Swagger file for me on request, usually with a lag time of weeks, but only for one of the APIs. The others are documented in emails, basically. This is a B2B type of thing; they are not publicly available APIs.

[–] Nomad@infosec.pub 2 points 13 hours ago (1 children)

Ask them to generate a schema file that you can download from the api. Or at least an endpoint that returns a hash of the current api schema file. That's cheap versioning telling you if something changes.

You can always use the Swagger schema to verify the API. So ask some basic questions about what should always be true and put that into validation scripts. If they use a framework, HEAD requests usually tell you some things.
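
The schema hash idea is only a few lines; a sketch, assuming they expose the spec at some URL for you (the URL here is made up):

```python
# Cheap change detection on the schema itself. Schema URL is a placeholder.
import hashlib
import pathlib
import requests

SEEN = pathlib.Path("schema.sha256")

schema_bytes = requests.get("https://vendor.example/api/openapi.json", timeout=30).content
digest = hashlib.sha256(schema_bytes).hexdigest()

if SEEN.exists() and SEEN.read_text() != digest:
    print("ALERT: vendor schema changed; diff the spec before trusting anything")
SEEN.write_text(digest)
```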

The last really bad vendor had an OpenAPI page that listed the endpoints, but the API wouldn't adhere to the details given there. I discovered that their website used the API all the time, and by surfing it I was able to work out which parameters were required, etc.

Last idea is statistics. Grab any count data you can get, like from pagination data, and build a baseline of available data over time. That gives you an expected count, and you can detect significant divergences.
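
A sketch of that baseline, with made-up endpoint and thresholds:

```python
# Append today's record count to a history file and flag big swings.
# Endpoint, window, and threshold are all made up for illustration.
import json
import pathlib
import statistics
import requests

HISTORY = pathlib.Path("host_counts.json")

count = len(requests.get("https://vendor.example/api/hosts", timeout=30).json())
history = json.loads(HISTORY.read_text()) if HISTORY.exists() else []

if len(history) >= 7:  # wait for some baseline first
    mean = statistics.mean(history)
    stdev = statistics.stdev(history) or 1.0
    if abs(count - mean) > 3 * stdev:
        print(f"ALERT: count {count} diverges from baseline mean {mean:.0f}")

history.append(count)
HISTORY.write_text(json.dumps(history[-90:]))  # keep ~3 months of daily runs
```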

I tend to show up at the vendor's IT guys in person and bribe them into helping me behind their bosses' backs. Chocolate, coffee, and some banter can do wonders.

[–] clay_pidgin@sh.itjust.works 1 points 12 hours ago* (last edited 7 hours ago)

I'm 3,500 miles from the vendor's devs, sadly.

Asking them to put the swagger file itself behind the API is a good idea. Their dev backlog is 3-24 months.

I used the same trick to determine the required headers and parameters - I checked their website which uses the same API.

The source of their delays is that different devs or teams "own" different endpoints and make their changes without documenting them. It's annoying; stuff like the same data being in field "hostId" on one endpoint but "deviceId" on another.

[–] yaroto98@lemmy.world 2 points 17 hours ago (1 children)

Are any of their APIs a GET that returns lists? I create a lot of automated API tests. You might be able to GET a list of users (or whatever), then pick a random 10 user_ids and query another API, say user_addresses, passing in each id one at a time and verifying a proper result. You don't have to verify the data itself, just that the values you care about are not empty and the keys exist.

You can dynamically test a lot this way, and if a key gets changed from 'street' to 'street_address', your failing tests should let you know.
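
A quick sketch of that spot check; the URLs, ID field, and required keys are placeholders:

```python
# Pull the list, sample ten IDs, verify the detail endpoint returns the
# keys you care about. URLs and key names are placeholders.
import random
import requests

BASE = "https://vendor.example/api"
REQUIRED_KEYS = {"street", "city", "postal_code"}

users = requests.get(f"{BASE}/users", timeout=30).json()
assert users, "users list came back empty"

for user in random.sample(users, min(10, len(users))):
    detail = requests.get(f"{BASE}/user_addresses/{user['user_id']}", timeout=30)
    assert detail.status_code == 200, f"{user['user_id']}: HTTP {detail.status_code}"
    body = detail.json()
    missing = REQUIRED_KEYS - body.keys()
    empty = {k for k in REQUIRED_KEYS & body.keys() if body[k] in (None, "", [])}
    assert not missing and not empty, f"{user['user_id']}: missing={missing} empty={empty}"
```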

[–] clay_pidgin@sh.itjust.works 2 points 17 hours ago (1 children)

Unfortunately, on their main API there's an endpoint with a list of objects and their IDs, and those IDs are used everywhere else, but the rest of the endpoints aren't connected. I can't walk e.g. school > students > student > grades or anything like that.

[–] yaroto98@lemmy.world 2 points 17 hours ago (1 children)

I made my career out of automated testing with a focus on APIs. I'm not aware of any easy tool to do what you want. The easiest way I've found to quickly whip up basic API tests is Python/pytest with requests. You can parameterize lots of inputs, run tests in parallel, easily add new endpoints as you go, benchmark the APIs for response times, etc. It'll take a lot of work in the beginning, then save you a lot of work in the end.
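
A bare-bones sketch of that setup; the endpoints and expected keys are placeholders:

```python
# Parameterized shape checks with pytest + requests. Endpoints and keys
# are placeholders; add rows to ENDPOINTS as things break.
import pytest
import requests

BASE = "https://vendor.example/api"

ENDPOINTS = [
    ("/hosts", {"hostId", "hostname"}),
    ("/categories", {"categoryId", "name"}),
]

@pytest.mark.parametrize("path,required_keys", ENDPOINTS)
def test_endpoint_shape(path, required_keys):
    resp = requests.get(BASE + path, timeout=30)
    assert resp.status_code == 200
    data = resp.json()
    assert data, f"{path} returned an empty body with a 200"
    assert required_keys <= data[0].keys(), f"{path} is missing expected keys"
```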

Now, AI can make the process go faster. If you give it a sample input and output, it can do 95% of a pytest test in ten seconds. But beware that last 5%.

[–] jjjalljs@ttrpg.network 2 points 16 hours ago

Yeah I would use python and pytest, probably.

You need to decide what you expect to be a passing case. Known keys are all there? All values in acceptable range? Do you have anything where you know exactly what the response should be?

How many endpoints are there?

[–] whotookkarl@lemmy.dbzer0.com 2 points 17 hours ago (1 children)

A couple of approaches: set up a batch process on a frequent interval to call the API and run tests against the responses, or have the service consumer publish events to a message bus and monitor the events. It depends on things like whether I own both the service and the client or just the client, whether I can make changes to the client or only add monitoring externally, and whether I can run test requests without creating/updating/destroying data (like a read-only service) or need real requests to observe.

[–] clay_pidgin@sh.itjust.works 2 points 14 hours ago (1 children)

The main one I have issues with is a read only API. I guess I make it harder on myself from this perspective by not maintaining one big client, but lots of separate single-purpose tools.

[–] whotookkarl@lemmy.dbzer0.com 2 points 14 hours ago (1 children)

Yeah, then I would set up a call or set of calls on an interval and test the responses: if a critical test fails, send an alert; treat less critical failures as warnings and roll them into a periodic report. In either case I'd log and archive all of it, so if they're bullshitting or violating contract SLAs I'll have some data to reference.
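
The log-and-archive half is cheap to do; a sketch with placeholder paths and endpoint:

```python
# Archive every raw response with a timestamp so SLA disputes have
# evidence. Endpoint, archive dir, and thresholds are placeholders.
import datetime
import json
import pathlib
import requests

ARCHIVE = pathlib.Path("api_archive")
ARCHIVE.mkdir(exist_ok=True)

resp = requests.get("https://vendor.example/api/hosts", timeout=30)
stamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%dT%H%M%SZ")

(ARCHIVE / f"hosts_{stamp}.json").write_text(json.dumps({
    "status": resp.status_code,
    "elapsed_ms": resp.elapsed.total_seconds() * 1000,
    "body": resp.text,
}))

if resp.status_code != 200 or resp.text.strip() == "[]":
    print("CRITICAL: hosts endpoint is broken")  # page someone
elif resp.elapsed.total_seconds() > 5:
    print("WARNING: hosts endpoint is slow")  # fold into the periodic report
```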

[–] clay_pidgin@sh.itjust.works 2 points 14 hours ago (1 children)

They do have an API Accuracy SLA but it's not defined anywhere so we do our best. They've only avoided penalties a few months out of the last several years!

[–] whotookkarl@lemmy.dbzer0.com 2 points 13 hours ago* (last edited 13 hours ago)

Oof, that is a rough one. If they're just absorbing the penalties, it sounds like the penalties need to be increased so there's a real financial incentive to actually do the work. In the meantime I'd just collect and report on as much data as I could.

[–] asudox@lemmy.asudox.dev 1 points 19 hours ago* (last edited 19 hours ago) (1 children)

Check out Semantic Versioning if they use it.

It's very nice.

[–] clay_pidgin@sh.itjust.works 12 points 19 hours ago

No, they don't have version numbers and they don't provide release notes when they change things intentionally. The more common problem for me is when they break it and don't notice.