Posts by Category: Statistics Resources and Big Data

Awareness Watch Talk Show for Wednesday October 4, 2017 at 2:00pm EDST – Statistics Resources and Big Data 2018

October 04, 2017

http://www.BlogTalkRadio.com/AwarenessWatch/

This program will feature my white paper Statistics Resources and Big Data 2018. We will highlight the latest and greatest resources and sources covering search engines, subject directories, articles, guides and tracers... literally everything on the Internet covering STATISTICS RESOURCES AND BIG DATA!! We will also be discussing my latest freely available Awareness Watch Newsletter V15N10 October 2017, featuring the 2018 Directory of Directories, as well as my freely available October 2017 Zillman Columns highlighting Statistics Resources and Big Data 2018. You may call in to ask your questions at (718) 508-9839. The show is live and thirty minutes in length, starting at 2:00pm EDST on Wednesday, October 4, 2017, and is then archived for easy review and access. Listen, Call and Enjoy....

DBpedia – Crowd-Sourced Community Effort To Extract Structured Information from Wikipedia

September 27, 2017

http://wiki.dbpedia.org/

DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data. We hope that this work will make it easier for the huge amount of information in Wikipedia to be used in some new interesting ways. Furthermore, it might inspire new mechanisms for navigating, linking, and improving the encyclopedia itself. The DBpedia project leverages this gigantic source of knowledge by extracting structured information from Wikipedia and by making this information accessible on the Web under the terms of the Creative Commons Attribution-ShareAlike 3.0 License and the GNU Free Documentation License. The English version of the DBpedia knowledge base describes 4.58 million things, out of which 4.22 million are classified in a consistent ontology, including 1,445,000 persons, 735,000 places (including 478,000 populated places), 411,000 creative works (including 123,000 music albums, 87,000 films and 19,000 video games), 241,000 organizations (including 58,000 companies and 49,000 educational institutions), 251,000 species and 6,000 diseases. In addition, we provide localized versions of DBpedia in 125 languages. All these versions together describe 38.3 million things, out of which 23.8 million are localized descriptions of things that also exist in the English version of DBpedia. The full DBpedia data set features 38 million labels and abstracts in 125 different languages, 25.2 million links to images and 29.8 million links to external web pages; 80.9 million links to Wikipedia categories, and 41.2 million links to YAGO categories. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™.
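As a rough illustration of what "asking sophisticated queries against Wikipedia" can look like, the sketch below builds a request URL for a SPARQL query. It assumes the public DBpedia SPARQL endpoint is reachable at https://dbpedia.org/sparql and accepts a `format` parameter; the function name and the example query are illustrative only and are not taken from the DBpedia site. The code only constructs the URL and does not contact the service.

```python
from urllib.parse import urlencode

# Assumption: the public DBpedia SPARQL endpoint lives at this address.
DBPEDIA_SPARQL_ENDPOINT = "https://dbpedia.org/sparql"


def build_dbpedia_query_url(sparql, endpoint=DBPEDIA_SPARQL_ENDPOINT):
    """Return a GET URL asking the endpoint to run `sparql` and reply in JSON."""
    params = {
        "query": sparql,
        # Assumption: the endpoint honours a `format` parameter.
        "format": "application/sparql-results+json",
    }
    return endpoint + "?" + urlencode(params)


# Illustrative query: ten resources classified as films, with English labels.
EXAMPLE_QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?film ?label WHERE {
  ?film a dbo:Film ;
        rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
LIMIT 10
"""

if __name__ == "__main__":
    # Paste the printed URL into a browser or HTTP client to run the query.
    print(build_dbpedia_query_url(EXAMPLE_QUERY))
```

The printed URL can be opened in a browser or fetched with any HTTP client; the response format and availability depend on the live service.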

ReDash – Make Your Company Data Driven

September 19, 2017

https://redash.io/

Redash is an open source tool for teams to query, visualize and collaborate. Redash is quick to set up and works with any data source you might need, so you can query from anywhere in no time. Share your results and dashboards with other team members and empower your entire organization to be data driven with no-code filters and parameters that instantly adjust. Get alerts for pre-defined triggers to your email, Slack or Hipchat (you can set up a custom webhook as well). Redash is our take on freeing the data within our company in a way that will better fit our culture and usage patterns. We tried to use traditional BI suites and discovered a set of bloated, technically challenged and slow tools/flows. What we were looking for was a more hacker’ish way to look at data, so we built one. Redash was built to allow fast and easy access to billions of records that we process and collect using Amazon Redshift (“petabyte scale data warehouse” that “speaks” PostgreSQL). Today Redash has support for querying multiple databases, including: Redshift, Google BigQuery, Google Spreadsheets, PostgreSQL, MySQL, Graphite, Axibase Time Series Database and custom scripts. Main Features include: 1) Query editor – enjoy all the latest standards like auto-complete and snippets. Share both your results and queries to support an open and data driven approach within the organization; 2) Visualization – once you have your dataset, select one of our 9 types of visualizations for your query. You can also export or embed it anywhere; 3) Dashboard – combine several visualizations into a topic targeted dashboard; 4) Alerts – get notified via email, Slack, Hipchat or a webhook when your query’s results need attention; and 5) API – anything you can do with the UI, you can do with the API. Easily connect results to other systems or automate your workflows. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™. This will be added to Statistics Resources and Big Data Subject Tracer™.
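To give a sense of the API feature mentioned above, here is a minimal sketch that builds the URL for fetching a saved query's latest results. The path pattern (`/api/queries/<id>/results.<format>` with an `api_key` parameter) is an assumption based on commonly documented Redash usage and should be checked against your Redash version; the host name, query number and key shown are hypothetical placeholders, and the helper name is my own.

```python
from urllib.parse import urlencode


def redash_results_url(base_url, query_id, api_key, fmt="json"):
    """Build a URL for the cached results of a saved Redash query.

    Assumption: the instance exposes /api/queries/<id>/results.<fmt>
    and accepts the key as an `api_key` query parameter.
    """
    if fmt not in ("json", "csv"):
        raise ValueError("fmt must be 'json' or 'csv'")
    path = "/api/queries/{}/results.{}".format(int(query_id), fmt)
    return base_url.rstrip("/") + path + "?" + urlencode({"api_key": api_key})


if __name__ == "__main__":
    # Hypothetical host, query id and key, for illustration only.
    print(redash_results_url("https://redash.example.com", 42, "YOUR_API_KEY"))
```

A URL built this way could be handed to another system (a spreadsheet import, a script, a scheduler) to pull results without going through the UI, which is the kind of workflow automation the feature list describes.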

Updated> Statistics Resources and Big Data 2018 White Paper Dataset Link Compilation

September 18, 2017

http://www.StatisticsResources.com/

I have just updated my white paper dataset link compilation for Statistics Resources and Big Data 2018 Subject Tracer™ by Marcus P. Zillman, M.S., A.M.H.A. It is now a 34-page .pdf document (278KB). [Completely updated with all links validated and new URLs added on September 11, 2017] Other white papers are available by clicking here.

This research is powered by Subject Tracer Bots™ from the Virtual Private Library™. Isn't yours?

Trifacta – Data Wrangling

August 18, 2017

https://www.trifacta.com/

Trifacta’s mission is to create radical productivity for people who analyze data. They are deeply focused on solving for the biggest bottleneck in the data lifecycle, data wrangling, by making it more intuitive and efficient for anyone who works with data. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™.

Kaggle – Home of Data Science and Machine Learning

August 07, 2017

https://www.kaggle.com/

Kaggle helps you learn, work and play. Features include: a) Competitions – Climb the world’s most elite machine learning leaderboards, b) Datasets – Explore and analyze a collection of high-quality public datasets, and c) Kernels – Run code in the cloud and receive community feedback on your work. This will be added to Artificial Intelligence Resources Subject Tracer™. This will be added to Script Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™. This will be added to Statistics Resources and Big Data Subject Tracer™.

The Magazine of Early American Datasets

July 24, 2017

http://repository.upenn.edu/mead

The University of Pennsylvania Libraries offers the Magazine of Early American Datasets (MEAD), a collection of datasets for researchers of early American history. The datasets are collected from organizations such as the American Antiquarian Society as well as from individual scholars. Visitors can download datasets in whatever format their original authors used or as comma-separated values (.csv). Each entry also includes a codebook, allowing researchers to use this data with ease. One highlight of this collection is two early nineteenth century admissions books from the Eastern State Penitentiary, transcribed by Scott Ziegler of the American Philosophical Society and Michelle Ziogas of Drexel University. This dataset includes intake notes on each incarcerated individual. For example, the notes on a Philadelphia blacksmith charged with burglary read: “Seems like an old convict & very insensible. No wish to intercourse with me on religious subjects.” Other data sets include the 19th Century American Children’s Book Trade Directory (courtesy of the American Antiquarian Society); a collection of George Washington’s shipping invoices; and the 1790 Census of black individuals living in Philadelphia. This will be added to the tools section of Research Resources Subject Tracer™. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to Reference Resources Subject Tracer™. Copyright © 2017 Internet Scout Research Group – http://scout.wisc.edu

DataPortals.org

July 24, 2017

http://www.DataPortals.org/

DataPortals.org is the most comprehensive list of open data portals in the world. It is curated by a group of leading open data experts from around the world – including representatives from local, regional and national governments, international organizations such as the World Bank, and numerous NGOs. This will be added to Open Datasets Subject Tracer™. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Deep Web Research and Discovery Resources.

Enigma Public – World’s Broadest Collection of Public Data

July 17, 2017

https://public.enigma.com/

Enigma is an operational data management and intelligence company. They believe in curiosity and the power of discovery. Their mission is to empower people to interpret and improve the world around them. To deliver on that ambitious goal, they place data into the context of the real world and make it connected, open, and actionable. Their repository of public data informs and trains each of their enterprise offerings. Enigma Public is the world’s broadest collection of public data. Take a tour to see everything you can do in the Public platform. This will be added to Open Datasets Subject Tracer™. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to Business Intelligence Resources Subject Tracer™.

US Government Web Services and XML Data Sources

June 13, 2017

http://usgovxml.com/

USGovXML.com is an index to publicly available web services and XML data sources that are provided by the US government. USGovXML.com indexes data sources from all three branches of government as well as its boards, commissions, corporations and independent agencies. This will be added to Open Datasets Subject Tracer™. This will be added to Deep Web Research and Discovery Resources 2017. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to New Economy Resources 2017.
