Do you know any sites for datasets? 💻

Aside from Kaggle, are there any other sites for obtaining datasets?

If anyone knows of one, please tell us.

1 Like

Hi Vishal,
Yep, there are tons of sites!
Many cities (worldwide, I would assume) have open data portals where you can grab city data (or state data, in the case of the US) as a .csv or .json file.
Ex: NYC Open Data
https://opendata.cityofnewyork.us
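
Once you have a .csv or .json export, getting it into Python takes only the standard library. Here is a minimal sketch, assuming the file is a flat table of records. The sample content and the column names (`tree_id`, `species`, `borough`) are made up for illustration and are not taken from any real portal.

```python
import csv
import io
import json

# Made-up sample content standing in for a downloaded portal export.
SAMPLE_CSV = "tree_id,species,borough\n1,oak,Brooklyn\n2,maple,Queens\n"
SAMPLE_JSON = (
    '[{"tree_id": "1", "species": "oak", "borough": "Brooklyn"},'
    ' {"tree_id": "2", "species": "maple", "borough": "Queens"}]'
)


def records_from_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def records_from_json(text):
    """Parse JSON text that holds a list of flat objects."""
    return json.loads(text)


csv_records = records_from_csv(SAMPLE_CSV)
json_records = records_from_json(SAMPLE_JSON)

# Both formats end up as the same list of records.
print(csv_records == json_records)
```

For a real download you would read the text from the saved file with `open(...)` instead of using the inline strings.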

The U.S. Census also has a TON of data.
https://www.census.gov

I’ve downloaded data from both. There’s a lot of cleanup involved with Census data, but I just do that in Excel or with something like OpenRefine.
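
If you would rather script the cleanup than do it by hand, here is a rough sketch of the kind of tidying involved: trimming stray whitespace, normalizing header names, and turning numbers like `1,234` into integers. The sample rows, the column names, and the `N/A` placeholder are all invented for illustration. Real Census files use their own layouts.

```python
import csv
import io

# Invented sample data; real Census exports have different columns.
RAW_CSV = """\
 Geography , Population ,Median Income
"Example County A"," 1,234 ","56,789"
Example County B,"2,345",N/A
"""


def clean_number(text):
    """Turn '1,234' into 1234; return None for blanks or placeholders."""
    stripped = text.strip().replace(",", "")
    return int(stripped) if stripped.isdigit() else None


def clean_rows(raw_csv):
    """Normalize headers, trim cells, and convert the numeric columns."""
    reader = csv.reader(io.StringIO(raw_csv))
    header = [name.strip().lower().replace(" ", "_") for name in next(reader)]
    rows = []
    for record in reader:
        if not record:
            continue  # skip blank lines
        row = dict(zip(header, (cell.strip() for cell in record)))
        row["population"] = clean_number(row["population"])
        row["median_income"] = clean_number(row["median_income"])
        rows.append(row)
    return rows


rows = clean_rows(RAW_CSV)
for row in rows:
    print(row)
```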

Anything you’re interested in (government, art, sports, etc.) probably has an online repository for its data. Because I’m a baseball nerd, I’ve also gotten data from Baseball Reference dot com. (They’re an example of a site that forbids scraping, and they will ban your IP if you do scrape. All their data is available to copy and paste as a csv file.)

Oh, and I recently realized that the NY Met Museum allows people to scrape data from its site. I want to do that soon.

If you know how to build a web scraper, you can grab data from websites that way too. BUT read each site’s guidelines/rules about scraping its data first. Many sites don’t allow it, and others have very stringent rules. Here are some guidelines.
https://medium.com/velotio-perspectives/web-scraping-introduction-best-practices-caveats-9cbf4acc8d0f
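
One mechanical check you can do before scraping is to look at the site’s robots.txt. Here is a minimal sketch using Python’s built-in `urllib.robotparser`. The robots.txt content, the `example.com` URLs, and the `my-study-bot` user agent are all hypothetical, and the rules are inlined so the sketch runs offline. Keep in mind that robots.txt is only part of the picture: a site’s terms of use can forbid scraping even when robots.txt does not.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Build a parser from robots.txt text instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# "my-study-bot" is a made-up user agent name.
allowed_public = parser.can_fetch(
    "my-study-bot", "https://example.com/data/table.html"
)
allowed_private = parser.can_fetch(
    "my-study-bot", "https://example.com/private/table.html"
)

print("public page allowed:", allowed_public)
print("private page allowed:", allowed_private)
```

Against a live site you would call `parser.set_url(...)` with the site’s robots.txt address and then `parser.read()` instead of passing the text in directly.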

1 Like

Thank you so much @lisalisaj for replying so quickly. Yes, Beautiful Soup is the next lesson I’m moving on to, and I will keep that in mind. Thank you for the link. :raised_hands::taco:

1 Like

Maybe these will be of some interest too?

https://www.reddit.com/r/datasets/

https://dataportals.org/about

2 Likes

Thank you @toastedpitabread for providing links to these sites. I didn’t know that Reddit also provides datasets.

Reddit, Discord, and IRC are surprisingly deep in terms of resources. It’s good to keep an eye out and see what they have to offer.

Ooooo, I didn’t know about that. Thanks @toastedpitabread

3 Likes

Woahhhhh, even NASA has datasets :open_mouth:

And why not make your own dataset?

Have you heard of this?

You can download the Science Journal app (both Android and iOS) and collect data from your microphone (decibel levels) or from the gyros and accelerometers and then export those data points for analysis!!

Example project: How about collecting noise levels at different places (your living room, a subway station, a restaurant, etc.) and using K-Means to try to identify each location!
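
To give a feel for that project, here is a small sketch of K-Means on one-dimensional decibel readings. The readings are made up, not exported from the app, and the algorithm is hand-rolled in plain Python so it runs without extra installs. In practice you would more likely use a library such as scikit-learn. Note that K-Means only finds groups of similar readings. It does not know which group is the subway, so you still have to label the clusters yourself.

```python
# Made-up decibel readings from three imagined locations, mixed together.
READINGS = [38, 94, 70, 40, 96, 68, 41, 97, 72, 39, 95, 71]


def kmeans_1d(values, k, iterations=20):
    """Cluster numbers into k groups; returns the sorted cluster centers."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: pick evenly spaced points from the sorted data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for value in values:
            # Assign each reading to its nearest center.
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Move each center to the mean of its readings (keep it if empty).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return sorted(centroids)


def predict(value, centroids):
    """Return the index of the center closest to a new reading."""
    return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))


centroids = kmeans_1d(READINGS, k=3)
print("cluster centers:", centroids)
print("a 50 dB reading falls in cluster", predict(50, centroids))
```

With real recordings the locations will overlap more than in this toy data, so adding more features (for example, how much the level varies over time) would probably help separate them.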

2 Likes

Thank you very much for sharing this information.

1 Like