New Feature: Garbage In, Concert History Out
Key Takeaways
- The input is a product decision. A blank box that accepts anything is a deliberate choice to meet users where their data actually lives, not the absence of a design decision
- LLMs are exceptional preprocessing layers. They parse and bring context. An LLM that knows Pearl Jam toured Wrigley Field is more useful than one that just extracts tokens
- Verification is what makes AI output trustworthy. The LLM structures the data; the API confirms it. Neither is enough alone
- Design for ambiguity, not against it. When the system can’t be certain, surface the options and let the user decide. Surfacing what you don’t know is the honest design
Somewhere in my house is a shoebox of ticket stubs. Hundreds of them, from shows going back decades. Some are pristine. Some are torn. A few are completely faded. And on the wall, there’s a poster collection that tells the same incomplete story: the shows I cared enough to frame. A record, but not a complete one.
Pearl Jam alone: 53 shows. Hundreds more across everything else. And for plenty of those, I couldn’t tell you with confidence exactly when they were.
That’s the problem I sat down to solve. A personal one, more than a technical one. TicketTrek started as a planning tool: upcoming shows, trip logistics, what’s worth traveling for. But the longer I used it, the more I wanted it to be something more. Not just where I’m going, but where I’ve been. A canonical record of every show I’ve seen: verified, dated, complete. The first one I’ve ever had.
Building that meant importing the past. And the past doesn’t live in clean fields.
The Data Problem Everyone Has
Concert history doesn’t live in a database. It lives in:
- Fan shorthand (“LA 1, LA 2, Wrigley 2016”)
- Show history pages on artist sites, formatted for reading rather than importing
- Email confirmations from three different ticketing platforms
- Calendar entries with names like “PJ!!!!”
- A shoebox of stubs
- Memory, which is almost always wrong on the details
I already had a useful API for this: Setlist.fm, which has a deep catalog of verified show data (dates, venues, setlists). If I could get a query to it, I could confirm a show actually happened when and where I thought it did. That’s what “canonical” requires. Not what I remember. What actually occurred.
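As a sketch of what that verification lookup involves: the Setlist.fm REST API exposes a setlist search endpoint that accepts the exact fields described here (artist, year, city, venue). The endpoint and parameter names below come from its public documentation; the `build_query` helper is my own illustration, not TicketTrek's actual code.

```python
# Sketch: turning a structured guess into a Setlist.fm search URL.
# Only the fields we actually extracted go into the query; anything
# unknown is omitted rather than guessed.
from urllib.parse import urlencode

BASE = "https://api.setlist.fm/rest/1.0/search/setlists"

def build_query(artist, year=None, city=None, venue=None):
    """Map extracted fields onto the API's search parameters."""
    params = {"artistName": artist}
    if year:
        params["year"] = year
    if city:
        params["cityName"] = city
    if venue:
        params["venueName"] = venue
    return f"{BASE}?{urlencode(params)}"

url = build_query("Pearl Jam", year=2016, venue="Wrigley Field")
```

(The real API also requires an `x-api-key` header on the request, omitted here for brevity.)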
The conventional path to getting there: a search form. Artist name, year, city. Fill in the fields, submit, match. Clean, accurate, and completely mismatched to how this data actually exists. Nobody’s Pearl Jam history lives in three tidy fields. Mine certainly doesn’t.
A form that rigid works for a small number of recent shows where the details are fresh. For hundreds of shows spanning decades, in the formats they actually exist in, it’s friction, the kind that makes a feature feel like a chore and kills the motivation to use it at all.
So I didn’t build the form.
The Blank Box
Paste whatever you have. The AI figures out the rest.
The blank box is a deliberate decision to move the translation problem (from human memory to structured data) to the part of the stack that’s actually good at it.
Here’s how it works:
The input goes to a Lambda function backed by Gemini 2.5 Flash. The model’s job is to parse whatever arrives (shorthand, prose, abbreviations, partial dates) and return structured JSON: artist, year, city, and venue where available. That JSON is what Setlist.fm can actually work with.
What makes this more than a fancy text parser is that the LLM isn’t working blind. It brings context. It knows that “Wrigley” in a Pearl Jam context means Wrigley Field in Chicago. It has a general sense of when bands toured and where. Not definitive knowledge, but enough to make an informed first pass. Think of it as a data entry person who’s also an enthusiast. They’re not guessing randomly. They’re making educated inferences, then handing the work to verification.
That distinction matters. The LLM is the preprocessing layer that gets the data into a shape the API can confirm, not the source of truth. The two together do what neither can do alone.
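Treating the LLM as a preprocessing layer rather than a source of truth implies one concrete step: validate its JSON before it reaches the API. The field names below mirror the ones the post describes; the schema and helper are my own sketch, not the app's actual code.

```python
# Sketch: validating the model's output before verification.
# A record without an artist is unverifiable, so it's dropped
# rather than guessed at; unexpected keys are stripped.
import json

REQUIRED = {"artist"}
OPTIONAL = {"year", "city", "venue"}

def parse_model_output(raw: str):
    """Parse the LLM's JSON into show records fit for API lookup."""
    cleaned = []
    for show in json.loads(raw):
        if not show.get("artist"):
            continue  # nothing to verify without an artist
        cleaned.append(
            {k: v for k, v in show.items() if k in REQUIRED | OPTIONAL}
        )
    return cleaned

raw = '[{"artist": "Pearl Jam", "year": 2016, "venue": "Wrigley Field"},' \
      ' {"year": 1998}]'
shows = parse_model_output(raw)
```

The second entry is silently dropped: a year with no artist can't be confirmed, and an unverifiable guess is worse than a gap.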
The Honest Part
Memory fails. That’s the whole reason the system exists.
When Setlist.fm returns multiple possible matches, the app doesn’t pick one. It surfaces all of them and asks. Two nights at the United Center in Chicago in August 2009. Four possible New York shows in 2010, three at Madison Square Garden, one at the Prudential Center in Newark, and even the Saturday Night Live appearance from March of that year. I went to some of those. The system doesn’t pretend to know which.
This is where the canonical promise gets kept. The user is the final authority. The triage screen isn’t a fallback for when the system fails. It’s the design acknowledging that human memory is imprecise, that shows get misremembered, and that a verified record requires a human to close the loop.
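The triage rule itself is small enough to state in code. This is a minimal sketch of the policy described above (names and types are illustrative, not the app's implementation): exactly one match auto-confirms; zero or many go back to the user.

```python
# Sketch: never auto-pick among ambiguous matches.
from dataclasses import dataclass

@dataclass
class Match:
    artist: str
    date: str
    venue: str

def triage(matches):
    """Return (confirmed, needs_review). Only an unambiguous single
    match is confirmed; anything else waits for the human."""
    if len(matches) == 1:
        return matches[0], []
    return None, matches  # zero or many: the user closes the loop

confirmed, review = triage([
    Match("Pearl Jam", "2009-08-23", "United Center"),
    Match("Pearl Jam", "2009-08-24", "United Center"),
])
```

Two nights at the same venue means `confirmed` stays `None` and both shows land on the triage screen.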
The output is confirmed, API-verified, and user-approved. That’s what makes it worth storing.
The Bigger Picture
The concert app is a personal project. But the pattern is not unique to concert history.
Every organization has data that exists in the wrong format, field notes that never reach the CRM, institutional knowledge trapped in someone’s inbox, history that lives in a spreadsheet nobody can find.
I’ve watched the same pattern cost real money in M&A integration work: institutional knowledge that lived in a departing executive’s head, inaccessible not because it wasn’t valuable, but because nobody built the on-ramp for it.
The standard solution is a form: enforce structure at the point of entry, make the data problem the user’s problem. The AI-native alternative is different: accept the data as it exists, and translate it on the way in. The friction of a rigid input is a people problem. The translation is a technical one. Those aren’t the same problem, and conflating them is why so many data initiatives stall before they start.
The shoebox and the poster wall aren’t going anywhere. But for the first time, I have somewhere to put what they represent, and a way to get it there without filling out a form I was never going to finish.
These are my personal observations from tinkering on a side project; they don’t necessarily reflect the views of my firm. That said, we have some great AI tools and solutions, and I’d love to tell you about them.