ETL Pipeline to Website Design Suggestions
Unanswered
Sphynx posted this in #help-forum
SphynxOP
I have several Python scripts that parse through large JSON event logs and extract information from them. The parsing may be computationally expensive. The data then needs to be stored in some DB like Postgres before my Next app touches it. I'm not very familiar with Next or Prisma, and this is my first web app.
So far, the pipeline I'm thinking of is: parsing -> insert to DB using SQLAlchemy ORM -> PostgreSQL DB <- Prisma (Next.js queries)
Does this sound good? Also, would you recommend keeping everything in the same folder or keep the Next app and the event log parsing separate?
9 Replies
Sun bear
- when exactly does this processing happen? as a result of user input, or on its own basis? if you are going to drive this as a result of user interaction to your website -
- can you rewrite this in javascript? are you parsing it yourself manually or relying on some library like pandas or numpy to do the heavy computation
SphynxOP
I'll have some script that is continuously running on some server calling an API that checks real world events (Fortnite tournaments). Once some tournament occurs/ends, it'll fetch match event logs and pass them to the parsers. A good chunk of the code is vectorized using numpy. I'll then want to display that data on the website.
Sun bear
yeah probably not worth rewriting this in javascript
i think the architecture you described would be suitable
although you gotta watch out and make sure next.js does not cache the website with old data that never refreshes
you'll need to look into either having your script call a webhook to revalidate, doing it on a schedule with ISR, or eating the cost of dynamically refreshing each request if you don't care about caching
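A minimal sketch of the webhook option, from the Python side: once the parser has written new rows to the database, it sends a POST to the Next.js app asking it to refresh a page. Everything specific here is an assumption for illustration: the `/api/revalidate` endpoint, the `x-revalidate-secret` header, and the `{"path": ...}` payload are hypothetical and would have to match a route you write yourself in the Next app (for example, one that checks the secret and then calls `revalidatePath`).

```python
import json
import urllib.request

# Hypothetical values: the endpoint path, header name, and payload shape are
# assumptions for this sketch. Next.js does not provide this route by default.
REVALIDATE_URL = "https://example.com/api/revalidate"
SECRET_HEADER = "x-revalidate-secret"


def build_revalidate_request(url: str, secret: str, path: str) -> urllib.request.Request:
    """Build a POST request asking the Next.js app to revalidate one page path."""
    body = json.dumps({"path": path}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", SECRET_HEADER: secret},
    )


def notify_next_app(secret: str, path: str, timeout: float = 10.0) -> int:
    """Send the revalidation request and return the HTTP status code."""
    request = build_revalidate_request(REVALIDATE_URL, secret, path)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

The script would call something like `notify_next_app(secret, "/tournaments")` right after the database commit succeeds, so the site is only refreshed once the new data is actually there. Building the request in its own function keeps it checkable without touching the network.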
SphynxOP
the nature of the data is unlikely to require reparsing the logs for each request
Sun bear
you won't need to worry about this until you host, because it depends on how you're planning to host your next.js app
but just a heads-up, since it's your first time: next sometimes behaves differently when deployed due to the caching behaviour