Next.js Discord

Discord Forum

How to decrease loading time?

Unanswered
Dwarf Crocodile posted this in #help-forum
Dwarf Crocodile (OP)
Hey guys, I am new to AWS and stuff.

Right now, I have 10 .csv files of 20-50 MB each on S3.

I have a dropdown menu in the frontend where I select any one of these CSVs, and then I fetch its data in the frontend.

Some .csv files are heavy; it takes almost 10 seconds to load.

Any suggestions?

134 Replies

(unnamed user)
One solution for big data is using databases. Since you want to work with AWS, use DynamoDB. If you don't want to handle DynamoDB yourself, you can use GraphQL with AWS Amplify. Pretty useful if you ask me. That way your data is there in less than 500 ms.
Dwarf Crocodile (OP)
Actually, we have to select a protein from a dropdown, and for each protein there is a specific .csv file containing that protein's data.

Since proteins are made up of amino acids, which are made up of nitrogen, carbon, etc., the CSV files contain all those atoms, their respective positions, their respective branches, the mutation probabilities of the amino acids, and many other things. So it's quite a large file.

And we are not fetching the same attributes from those CSV files every time; that's why we always need to load the complete CSV.
In the future there might be more dropdown options, hence more CSV files.
Is storing all of these in a database a better option than storing them in S3 / CloudFront?
Komondor
Yes. Since you are having to fetch the entire file, parse it, and only return some attributes from it, you are essentially querying the file. It'd be much faster to query a database.
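To make that concrete: filtering a CSV means parsing every row yourself, while a database engine does the filtering for you. A minimal Python sketch (Python because the OP's stack turns out to be Flask/SQLite; the `atom`/`position`/`mutation_prob` columns are invented for illustration):

```python
import csv
import io
import sqlite3

# Tiny stand-in for one protein CSV (the real files are 5-50 MB).
csv_text = "atom,position,mutation_prob\nN,1,0.02\nC,2,0.10\nO,3,0.05\n"

# CSV approach: every row must be parsed just to find the ones you need.
rows = list(csv.DictReader(io.StringIO(csv_text)))
csv_hit = [r for r in rows if r["atom"] == "C"]  # full scan, O(n)

# Database approach: load once, then let SQLite find matching rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE protein (atom TEXT, position INTEGER, mutation_prob REAL)")
con.executemany(
    "INSERT INTO protein VALUES (?, ?, ?)",
    [(r["atom"], r["position"], r["mutation_prob"]) for r in rows],
)
db_hit = con.execute("SELECT * FROM protein WHERE atom = ?", ("C",)).fetchall()
```

The key difference is that the "load once" step happens at import time, not on every request.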
Dwarf Crocodile (OP)
Umm, this makes sense. Which services are cheaper, though:
S3 + CloudFront,
OR
DynamoDB,
OR
GraphQL + Amplify?
Also, these CSV files have so much data; is there any way of transferring the content from CSV to a database, or do I have to do it manually?
Komondor
Yes, SQL databases support CSV import for sure, since CSVs are already in row format.
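For example, loading a CSV into SQLite takes only a few lines with Python's standard library. A sketch, assuming a hypothetical `protein_a.csv` with made-up columns:

```python
import csv
import os
import sqlite3
import tempfile

# Write a tiny stand-in for one protein CSV (normally the file already exists).
path = os.path.join(tempfile.mkdtemp(), "protein_a.csv")
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["atom", "position", "mutation_prob"])
    writer.writerows([["N", 1, 0.02], ["C", 2, 0.10]])

con = sqlite3.connect(":memory:")  # use a file path for a persistent DB
con.execute("CREATE TABLE protein_a (atom TEXT, position INTEGER, mutation_prob REAL)")

# Stream the CSV rows straight into the table.
with open(path, newline="") as f:
    reader = csv.DictReader(f)
    con.executemany(
        "INSERT INTO protein_a VALUES (?, ?, ?)",
        ((r["atom"], r["position"], r["mutation_prob"]) for r in reader),
    )
con.commit()

count = con.execute("SELECT COUNT(*) FROM protein_a").fetchone()[0]
```

The sqlite3 command-line shell can do the same without code: `.mode csv` followed by `.import protein_a.csv protein_a`.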
Dwarf Crocodile (OP)
And which one should I use? DynamoDB or something else?
Komondor
DynamoDB is a key-value store, a NoSQL database.
Dwarf Crocodile (OP)
According to you, what would be the best option, considering the pricing of the services?
Komondor
A SQL database for sure.
I'm not familiar with Amplify, and S3 + CloudFront isn't SQL.
(unnamed user)
Databases are waaaay more effective, so in almost every case you will be cheaper with a database.
Dwarf Crocodile (OP)
I am already using SQLite as a database to store user credentials and stuff.
So should I add the CSV files' data to it somewhere?
Giant Angora
You can use MySQL or Postgres.
Komondor
Yes, SQLite should work for you.
Dwarf Crocodile (OP)
Will I face any performance issues if I do it in SQLite, considering that in the future there will be more CSV files (a typical CSV file is about 25 MB)?
Komondor
How many MB are you fetching in a single request?
Dwarf Crocodile (OP)
It depends on what option the user selects; there are CSV files ranging from 5-50 MB.
I am basically querying those CSV files: wherever my required data is, I find it and then show it on the frontend.
Actually, I don't want to give full access to my CSV files. Otherwise I would have let users download the CSVs and compute things on their own. But the data in the CSVs was generated by our ML models, and we don't want to make it public.
Komondor
Right, the file is 5-50 MB; you can compare that to a SQL table. But you're not returning the entire file's worth of data in a single request, are you?
So you'd query the database table (the file) and only return some of it.
Any SQL database can do this with ease.
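A sketch of what "only return some of it" looks like against a 10k-row table (row count and column names are assumptions based on the thread):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE protein (atom TEXT, position INTEGER, mutation_prob REAL)")

# 10,000 rows, roughly the row count mentioned in the thread.
con.executemany(
    "INSERT INTO protein VALUES (?, ?, ?)",
    [("C" if i % 100 == 0 else "N", i, 0.01) for i in range(10_000)],
)

# Only the rows and columns the frontend needs come back:
# 100 rows out of 10,000 instead of the whole file.
subset = con.execute(
    "SELECT position, mutation_prob FROM protein WHERE atom = ?",
    ("C",),
).fetchall()
```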
Dwarf Crocodile (OP)
So any SQL DB will save me time compared to S3.
Any idea how fast that would be?
(unnamed user)
Yes, it will be extremely fast compared to your current method.
Serbian Hound
@Dwarf Crocodile if you index properly, it will be significantly faster.
Dwarf Crocodile (OP)
For the CSV files, there are like 10k rows.
Serbian Hound
You probably don't even need to index specific fields; it'll just be much faster.
Yes, a DB is designed exactly for that.
Dwarf Crocodile (OP)
Any tutorials on how to do it?
Serbian Hound
Flat-file CSV isn't made for querying.
You have no SQL/MySQL knowledge?
Dwarf Crocodile (OP)
Not enough. I just know how to make a schema and store stuff.
Serbian Hound
That's all you need, plus a query.
Just make it basic; it will be fast enough, I'm sure.
Dwarf Crocodile (OP)
idk anything about indexing.
Serbian Hound
Well, even if you just have an incremental ID as your primary key index, and that's it, it'll still be way faster.
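Indexing really is that small a step. A hedged SQLite sketch (table and index names invented); `EXPLAIN QUERY PLAN` confirms the index actually gets used:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE protein (id INTEGER PRIMARY KEY, atom TEXT, position INTEGER)")
con.executemany(
    "INSERT INTO protein (atom, position) VALUES (?, ?)",
    [("C" if i % 2 else "N", i) for i in range(1_000)],
)

# One line turns full-table scans on `atom` into index lookups.
con.execute("CREATE INDEX idx_protein_atom ON protein (atom)")

# Ask SQLite how it will run the query; the plan should mention the index.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM protein WHERE atom = ?", ("C",)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)

n_carbon = con.execute(
    "SELECT COUNT(*) FROM protein WHERE atom = ?", ("C",)
).fetchone()[0]
```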
Dwarf Crocodile (OP)
And what is Redis? For caching and stuff?
Serbian Hound
That's a use case for Redis, yeah. Think of Redis as an in-memory solution,
so you can use it for your sessions and stuff like that.
Your data sounds relational, so I would definitely suggest a SQL DB solution.
Dwarf Crocodile (OP)
No need for caching?
Serbian Hound
No, not unless you see a need.
And if you do, you don't need Redis just to cache;
you can cache on your server.
For example, Next.js caches fetch endpoints anyway, iirc.
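The same "cache on your server" idea works in any Python backend with `functools.lru_cache`. A minimal sketch, assuming a hypothetical `protein_rows` function keyed by the dropdown selection (the table and call counter are invented to demonstrate the cache):

```python
import sqlite3
from functools import lru_cache

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE protein (atom TEXT, position INTEGER)")
con.execute("INSERT INTO protein VALUES ('C', 1)")

calls = {"db_queries": 0}  # just to show the cache working

@lru_cache(maxsize=32)
def protein_rows(atom: str):
    """Return rows for one dropdown selection, caching repeat requests."""
    calls["db_queries"] += 1
    return tuple(
        con.execute("SELECT position FROM protein WHERE atom = ?", (atom,)).fetchall()
    )

first = protein_rows("C")
second = protein_rows("C")  # served from the cache; no second query
```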
Dwarf Crocodile (OP)
I am using React and Flask,
and a SQLite DB.
Serbian Hound
You should convert those CSVs to SQL
and store them in your DB.
Query it and see how much faster it is.
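"Query it and see" can be simulated in one script: build an in-memory table around the row count mentioned in the thread, filter it both ways, and time them. The timings are machine-dependent (so they aren't asserted), and the schema is made up:

```python
import csv
import io
import sqlite3
import time

n = 10_000  # about the row count of a typical file here
rows = [("C" if i % 100 == 0 else "N", i, 0.01) for i in range(n)]

# CSV path: parse the whole file, then filter in Python.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["atom", "position", "mutation_prob"])
writer.writerows(rows)

t0 = time.perf_counter()
parsed = list(csv.DictReader(io.StringIO(buf.getvalue())))
csv_hits = [r for r in parsed if r["atom"] == "C"]
csv_time = time.perf_counter() - t0

# SQL path: with the one-time import done, filtering is a single query.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE protein (atom TEXT, position INTEGER, mutation_prob REAL)")
con.executemany("INSERT INTO protein VALUES (?, ?, ?)", rows)

t0 = time.perf_counter()
sql_hits = con.execute("SELECT * FROM protein WHERE atom = ?", ("C",)).fetchall()
sql_time = time.perf_counter() - t0

print(f"csv: {csv_time * 1000:.1f} ms, sql: {sql_time * 1000:.1f} ms")
```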
Dwarf Crocodile (OP)
Hmm, ok, will try.
Serbian Hound
Not the same thing, but I have an app that generates PDFs on the fly.
Previously I was generating them each time, because I thought they were small,
but now I store them in the DB.
Generating and serving took like a second; a DB lookup is instant,
like a few ms.
Dwarf Crocodile (OP)
Ohhh, that's really good.
Serbian Hound
Yeah, trust me, when you have large data,
a database is the best.
Dwarf Crocodile (OP)
Okk, and I just need to make a separate table for each CSV file?
Serbian Hound
Yeah, and think about the relationships between the tables.
It doesn't have to be anything special.
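One gotcha with the table-per-CSV layout: table names can't be bound as SQL parameters, so the dropdown value has to be validated against a whitelist before it is interpolated into the query. A hedged sketch (the table names are invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Hypothetical setup: one table per protein CSV, as discussed above.
for table in ("protein_a", "protein_b"):
    con.execute(f"CREATE TABLE {table} (atom TEXT, position INTEGER)")
con.execute("INSERT INTO protein_a VALUES ('C', 1)")
con.execute("INSERT INTO protein_b VALUES ('N', 2)")

# Whitelist of known tables; never interpolate raw user input into SQL.
TABLES = {"protein_a", "protein_b"}

def rows_for(protein: str):
    """Return all rows for one dropdown selection, rejecting unknown names."""
    if protein not in TABLES:
        raise ValueError(f"unknown protein: {protein}")
    return con.execute(f"SELECT atom, position FROM {protein}").fetchall()
```

Column values, by contrast, should always go through `?` placeholders as in the earlier examples.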
Dwarf Crocodile (OP)
Umm, ok. thnx 🫡
Serbian Hound
np
gl noob, let me know how it goes
Dwarf Crocodile (OP)
I will ask you guys if I get any doubts while working on this.
Serbian Hound
👍
Dwarf Crocodile (OP)
What do you do btw?
Serbian Hound
wym, for work?
I'm a frontend dev,
but I like backend also lol
Dwarf Crocodile (OP)
Good.
Hey, one more doubt:
2-3 of my CSV files have about 20k rows.
Ain't that wayy too much?
Serbian Hound
No wonder it's slow af lol.
You have to remember...
Dwarf Crocodile (OP)
hmm
Serbian Hound
...this is exactly what a DB is designed for.
20k is not a lot by DB standards.
Dwarf Crocodile (OP)
ohh
Serbian Hound
But for CSV it definitely is.
Dwarf Crocodile (OP)
So does querying a 100-row DB table take a similar time to querying a 10k-row DB table?
I mean, will querying 10k rows still take under 1 sec?
Serbian Hound
Depends on whether you're querying an index, in which case yes; in most cases you're not doing that,
but it won't be slow.
Yeah, it should be very, very fast.
Dwarf Crocodile (OP)
Cause right now it takes about 7-10 sec.
Hmm, cool.
Serbian Hound
Every time you're querying it? That's insane.
SQL will be much faster.
Dwarf Crocodile (OP)
Got it.
Dwarf Crocodile (OP)
Should def. use it then.
Serbian Hound
There's someone here handling 500 million rows on a pretty old PC.
Dwarf Crocodile (OP)
Waow.
Like, the most scalable leaderboard systems use Redis to show rankings in realtime.
idk for what reason they use Redis.
Serbian Hound
Redis is useful because it's in memory.
Dwarf Crocodile (OP)
But yeah, they also handle pretty large data.
I don't get it... "in memory"?
Serbian Hound
A leaderboard constantly changes and needs computation; your data is .csv files, so it won't even change.
In memory = RAM.
If something is a file, it's not stored in your RAM; it's on your hard drive.
You get it?
Dwarf Crocodile (OP)
Yes, yes. Ohh.
Serbian Hound
Like when you are on Google Chrome, the tabs you have open are "in memory".
Dwarf Crocodile (OP)
Hmm... Chrome tabs take up a lot of my RAM 😦
Serbian Hound
haha lol same