Next.js Discord

Discord Forum

Getting different responses! 🤔

Answered
Dwarf Crocodile posted this in #help-forum
Dwarf CrocodileOP
So, earlier, I was doing data fetching from S3 directly on the frontend:

const params = {
  Bucket: "front-end1",
  Key: selectedOption,
};

const data = await s3.getObject(params).promise();

console.log("ensemble_data----", data);


And then I would do parsing or other processing of this 'data'.
(I have fetched the raw data here, I guess; I am using "data.Body" later in the data processing, like:
const workbook = XLSX.read(data.Body, { type: "array" });
)

But, to make it more secure, I moved my data fetching from S3 to the backend.

Flask API endpoint:
@app.route('/api/ensemble/<selectedOption>', methods=['GET'])
@jwt_required()
def get_ensemble_data(selectedOption):
    current_user = get_jwt_identity()
    if current_user['role'] not in ['admin', 'client', 'employee']:
        return jsonify({'message': 'Unauthorized'}), 403

    try:
        response = s3.get_object(Bucket="front-end1", Key=selectedOption)
        file_content = response['Body'].read()

        return Response({"ensembleData": file_content})
    except Exception as e:
        return jsonify({"error": str(e)}), 500


Frontend Code:
const response = await axios.get(
  `${baseUrl}/api/ensemble/${selectedOption}`,
  {
    headers: {
      Authorization: `Bearer ${localStorage.getItem("token")}`,
    },
  }
);
const data = response.data.ensembleData;
console.log("ensemble_data----", data);


And then I would do the exact same parsing or other processing of this 'data' too.

But the thing is, these "data" values are not the same. I am attaching photos of the console logs for both.

1st pic: Old Code
2nd pic: New Code
Answered by B33fb0n3
you shouldn't store data inside an Excel file and parse it and do whatever else to retrieve it. S3 is a service to store files, not data. It's also not the place to serve data efficiently. Of course you can do all this, but you will either get bugs (like you see now) or a huge bill (you might see this in the future).

Store data where it wants to be stored, and that's inside a database. There it can be efficiently created, read, updated, and deleted (CRUD).

So create a database and put your data in there. Then use an ORM like drizzle to do the CRUD operations.
AWS itself offers a database service as well ("Amazon RDS"): https://aws.amazon.com/de/rds/postgresql/

It can also be seamlessly integrated with drizzle: https://orm.drizzle.team/docs/connect-aws-data-api-pg
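
(Since the backend in this thread is Flask rather than Node, here is the same put-the-rows-in-a-database idea sketched in Python with SQLAlchemy instead of drizzle. The table layout, column names, and connection string are illustrative assumptions, not anything from the thread:)

from sqlalchemy import Column, Float, Integer, String, create_engine, select
from sqlalchemy.orm import DeclarativeBase, Session

class Base(DeclarativeBase):
    pass

class EnsembleRow(Base):
    __tablename__ = "ensemble_rows"       # hypothetical table for the spreadsheet rows
    id = Column(Integer, primary_key=True)
    protein = Column(String, index=True)  # e.g. the former S3 key / dropdown option
    atom = Column(String)
    score = Column(Float)

# Placeholder RDS Postgres endpoint and credentials.
engine = create_engine("postgresql+psycopg2://user:pass@my-rds-host:5432/app")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Read only the rows for the selected protein, instead of downloading
    # and re-parsing a whole workbook on every request.
    rows = session.scalars(
        select(EnsembleRow).where(EnsembleRow.protein == "P12345")
    ).all()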

16 Replies

Dwarf CrocodileOP
actually, I am not storing the data; the files are already present.
And there are 50+ files in total. Each is very long.
But they all have the same format.
They are basically in .csv and .xlsx format.

So, I am fetching a specific file based on the user's selected option on the frontend.
And then I am mapping them and doing other processing.

So I need those files to be on S3, not in a DB
@Dwarf Crocodile So I need those files to be on S3, not in a DB
well... data is data. So get the data out of your Excel files and into a DB. That's the only solution I want to give you, and yes, I am speaking from experience about the problems you will face:
Of course you can do all this, but you will either get bugs (like you see now) or a huge bill (you might see this in the future)
@Dwarf Crocodile So, earlier, I was doing data fetching from S3 directly on the frontend: …
Dwarf CrocodileOP
But, for the time being, can you help with what's causing this problem?
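
(For reference, the most likely cause of the mismatch: Response({"ensembleData": file_content}) hands Flask a dict, which it treats as a generic iterable to stream, so the body that reaches axios is not the file bytes at all. And jsonify would not work here either, since raw bytes are not JSON-serializable. Below is a minimal sketch of one way to return the bytes unchanged, assuming the axios call is switched to responseType: "arraybuffer" and the parsing becomes XLSX.read(new Uint8Array(response.data), { type: "array" }); the JWT/role check from the original endpoint is omitted for brevity:)

import boto3
from flask import Flask, Response, jsonify

app = Flask(__name__)
s3 = boto3.client("s3")

@app.route('/api/ensemble/<selectedOption>', methods=['GET'])
def get_ensemble_data(selectedOption):
    try:
        response = s3.get_object(Bucket="front-end1", Key=selectedOption)
        file_content = response['Body'].read()
        # Return the raw bytes untouched instead of wrapping them in a dict;
        # this keeps the payload byte-identical to what the S3 SDK returned.
        return Response(file_content, mimetype="application/octet-stream")
    except Exception as e:
        return jsonify({"error": str(e)}), 500

With that, response.data on the frontend (read as an ArrayBuffer) should match the old data.Body apart from the wrapper type.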
@Dwarf Crocodile Ok, will try to transfer the data to a DB, but probably next month. Right now many more files are being generated in SageMaker and transferred to S3. Once that is done, I will try to transfer it.
that sounds great. I am pretty sure SageMaker also offers a way to export directly to your RDS database, as both are AWS services
and I think I will be able to read and process/parse the data much faster if it is in a DATABASE than by fetching it from S3
and maybe use CloudFront to make it even faster?
@Dwarf Crocodile wow, are there any tutorials or docs that I can follow?
I just checked the docs and it doesn't look like you can export it to RDS 😦
@Dwarf Crocodile and maybe use CloudFront to make it even faster?
you are right: CloudFront is a CDN and will serve your data. Serving it directly from the origin (your S3) can get very expensive
@B33fb0n3 you are right: CloudFront is a CDN and will serve your data. Serving it directly from the origin (your S3) can get very expensive
Dwarf CrocodileOP
My webpage takes a bit of time when it first loads a protein structure.

The thing is:
Every protein has a .csv, a .pdb, and some other .xlsx files on S3.
A user selects a protein from a dropdown, and then I fetch all the files for that protein.

These are large files (for some of the bigger proteins, about 5k+ lines).

The protein structures are generated from the data in the .pdb files.
The atom-level summaries shown when a user clicks a specific atom in the 3D structure come from the .csv file. And so on.

How can I improve my website's latency, and the overall performance and speed?
@Dwarf Crocodile solved?