Generate bulk test data, up to 100 TB, using the TPC-DS kit for big data analysis
Step 1: Stay logged in as a regular user; don't switch to root
Step 2: sudo apt-get update
Run this command to refresh the package lists so that the next step installs current package versions.
Step 3: sudo apt-get install gcc make flex bison byacc git
Next, install the build tools the kit needs: gcc, make, flex, bison, byacc, and git.
Step 4: git clone https://github.com/gregrahn/tpcds-kit.git
Clone the tpcds-kit repository from GitHub.
Step 5: cd tpcds-kit/tools
Move to the tpcds-kit/tools directory.
Step 6: make OS=LINUX
Last but not least, build the kit for Linux. This compiles the tools in this directory, including dsdgen, the data generator used in the next step.
Step 7: ./dsdgen -scale 5 -force
Lastly, this command generates 5 GB of test data as 24 .dat files. The -scale value is the data volume in GB, and -force overwrites existing output files without prompting.
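To check what Step 7 produced, a small helper like the one below can list each .dat file with its row count. This is only a sketch: summarize_dat_files is a made-up name, not part of tpcds-kit, and it assumes dsdgen writes one row per line.

```shell
# Hypothetical helper (not part of tpcds-kit): list each .dat file in a
# directory with its row count, assuming one row per line.
summarize_dat_files() {
    dir="${1:-.}"
    count=0
    for f in "$dir"/*.dat; do
        # Skip the unexpanded pattern when no .dat files are present.
        [ -e "$f" ] || continue
        rows=$(wc -l < "$f" | tr -d ' ')
        printf '%s\t%s\n' "$(basename "$f")" "$rows"
        count=$((count + 1))
    done
    printf 'total_files\t%s\n' "$count"
}

# Usage, from tpcds-kit/tools after Step 7:
#   summarize_dat_files .
```

The last line of the output is the number of .dat files found, which makes it easy to see whether the run completed.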
You can generate up to 100 TB of test data just by changing the -scale value in the Step 7 command. The table below shows row counts per scale factor.
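For the larger scale values, a single dsdgen run can take a long time, so it may help to split the work into chunks. The sketch below only prints the commands, one per chunk, so you can review them before running anything. The helper name dsdgen_chunk_commands and the ./data directory are made up for illustration, and the -parallel, -child, and -dir options are assumed to be available in your build; confirm with ./dsdgen -help.

```shell
# Hypothetical helper (not part of tpcds-kit): print one dsdgen command per chunk.
# Assumes dsdgen accepts -parallel, -child, and -dir; check ./dsdgen -help.
dsdgen_chunk_commands() {
    scale="$1"    # data volume in GB, as in Step 7
    chunks="$2"   # number of chunks to split the run into
    outdir="$3"   # directory for the .dat files (create it beforehand)
    child=1
    while [ "$child" -le "$chunks" ]; do
        printf './dsdgen -scale %s -dir %s -parallel %s -child %s -force\n' \
            "$scale" "$outdir" "$chunks" "$child"
        child=$((child + 1))
    done
}

# Example: print the four commands for a 100 GB run into ./data.
dsdgen_chunk_commands 100 4 ./data
```

Each printed command generates one chunk of the data, so the commands can be started in separate terminals or as background jobs from tpcds-kit/tools.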