Posts by Category: Data Mining Resources

Kickstarter Datasets

May 18, 2018

https://webrobots.io/kickstarter-datasets/

Webrobots.io runs a scraper robot that crawls all Kickstarter projects and collects the data in JSON format. Since March 2016 they have run this crawl once a month, and datasets are available through the previous month. This will be added to the following: Statistics Resources and Big Data Subject Tracer™; Data Mining Resources Subject Tracer™; Bot Research Subject Tracer™; Web Data Extractors White Paper; Business Intelligence Resources Subject Tracer™; Entrepreneurial Resources Subject Tracer™; the tools section of Research Resources Subject Tracer™; and Deep Web Research and Discovery Resources Subject Tracer™.
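For readers who want to explore a JSON dump like this, here is a minimal Python sketch. It assumes the file holds one JSON object per line; that layout and the field names ("name", "state", "pledged") are illustrative assumptions, not the documented schema of the Kickstarter datasets, so check an actual file before relying on them.

```python
import json
from collections import Counter

# Hypothetical sample: two records, one JSON object per line.
# The field names here are assumptions made for illustration only.
SAMPLE = """\
{"name": "Project A", "state": "successful", "pledged": 1200.0}
{"name": "Project B", "state": "failed", "pledged": 35.5}
"""

def load_records(text):
    """Parse one JSON object per non-empty line of text."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]

def count_by_state(records):
    """Count records per value of the (assumed) 'state' field."""
    return Counter(r.get("state", "unknown") for r in records)

records = load_records(SAMPLE)
print(count_by_state(records))
```

To try it on a downloaded file, read the file's contents into a string and pass that to load_records in place of SAMPLE.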

Indiegogo Datasets

May 18, 2018

https://webrobots.io/indiegogo-dataset/

Webrobots.io runs a scraper robot that crawls Indiegogo projects and collects data about them. The robot was launched in May 2016, and they run a crawl once a month. The first dataset contains data on about 91.5k projects. This will be added to the following: Statistics Resources and Big Data Subject Tracer™; Data Mining Resources Subject Tracer™; Bot Research Subject Tracer™; Web Data Extractors White Paper; Business Intelligence Resources Subject Tracer™; Entrepreneurial Resources Subject Tracer™; the tools section of Research Resources Subject Tracer™; and Deep Web Research and Discovery Resources Subject Tracer™.

Overview – Open Source Document Mining

May 14, 2018

https://blog.overviewdocs.com/

Overview is a document mining application originally built for investigative journalists. It is also used for legal work, training machine learning models, and research of all types. It is a visualization and analysis tool designed for sets of documents, from dozens to millions of pages of material. Overview imports many formats and languages and includes built-in OCR, a sophisticated search engine, document annotation, word clouds, entity detection, and topic-based document clustering. It supports tagging and metadata as well as many input and export formats. If you need custom analysis, you can write your own plugins using the API. This will be added to the following: Journalism Resources Subject Tracer™; Data Mining Resources Subject Tracer™; Business Intelligence Resources Subject Tracer™; Entrepreneurial Resources Subject Tracer™; and the tools section of Research Resources Subject Tracer™.

Webhose.io – Turn Unstructured Web Content Into Machine-Readable Data Feeds That You Can Consume On Demand

May 12, 2018

https://webhose.io/

Webhose.io provides on-demand access to web data feeds that anyone can consume. It lets you build, launch, and scale data-driven operations as you grow, whether you are an entrepreneur, a researcher, or a senior executive at a Fortune 500 company. Developers get free access to the same web data feeds that power its growing customer base of global media analytics and monitoring leaders. Every web data feed is optimized to deliver up-to-the-minute coverage of a specific content domain, such as news, blogs, online discussions, and more. Just define your filters so you can focus on what you do best. Webhose.io is the brainchild of Ran Geva and Guy Mor, two entrepreneurs with extensive experience in technology, data mining, and product development who set out to build a simple solution to a complicated problem for anyone who wants to consume data from the web. This will be added to the following: Data Mining Resources Subject Tracer™; Web Data Extractors White Paper; Business Intelligence Resources Subject Tracer™; Entrepreneurial Resources Subject Tracer™; and the tools section of Research Resources Subject Tracer™.

Awareness Watch Talk Show for Wednesday May 2, 2018 at 2:00pm EDT – Data Mining Resources 2018

May 02, 2018

http://www.BlogTalkRadio.com/AwarenessWatch/

This program will feature my recently updated white paper, Data Mining Resources 2018. We will highlight the latest and greatest data mining resources, sources, search engines, subject directories, articles, guides, and tracers: literally everything on the Internet covering data mining! We will also discuss my latest freely available Awareness Watch Newsletter V16N5 May 2018, featuring 2018 New Economy Resources and Tools, as well as my freely available May 2018 Zillman Columns, highlighting eCommerce Resources on the Internet 2018. You may call in with your questions at (718) 508-9839. The show is live and thirty minutes long, starting at 2:00pm EDT on Wednesday, May 2, 2018, and is then archived for easy review and access. Listen, call, and enjoy.

Tanagra Project – Free Data Mining Software for Academic and Research Purposes

April 28, 2018

http://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html

TANAGRA is free data mining software for academic and research purposes. It offers several data mining methods from the areas of exploratory data analysis, statistical learning, machine learning, and databases. This project is the successor of SIPINA, which implements various supervised learning algorithms, especially the interactive and visual construction of decision trees. TANAGRA is more powerful: it contains supervised learning methods but also other paradigms such as clustering, factorial analysis, parametric and nonparametric statistics, association rules, and feature selection and construction algorithms. TANAGRA is an “open source project”: every researcher can access the source code and add their own algorithms, provided they agree to and comply with the software distribution license. The main purpose of the Tanagra project is to give researchers and students easy-to-use data mining software that conforms to current norms of software development in this domain (especially in the design of its GUI and the way it is used) and allows the analysis of either real or synthetic data. The second purpose is to offer researchers an architecture that lets them easily add their own data mining methods and compare their performance. TANAGRA acts more as an experimental platform that lets them focus on the essentials of their work and spares them the unpleasant part of programming this kind of tool: data management. The third and last purpose, aimed at novice developers, is to share a possible methodology for building this kind of software. They can take advantage of free access to the source code to see how this sort of software is built, the problems to avoid, the main steps of the project, and which tools and code libraries to use. In this way, Tanagra can be considered a pedagogical tool for learning programming techniques.
TANAGRA does not presently include what makes up the strength of commercial software in this domain: a wide set of data sources, direct access to data warehouses and databases, data cleansing, interactive utilization, and so on. This will be added to the following: Data Mining Resources Subject Tracer™; Business Intelligence Resources Subject Tracer™; Entrepreneurial Resources Subject Tracer™; and the tools section of Research Resources Subject Tracer™.

Jaspersoft® ETL – The Open Source Data Integration Platform

April 24, 2018

https://community.jaspersoft.com/project/jaspersoft-etl

Jaspersoft ETL is easy to deploy and outperforms many proprietary and open source ETL systems. It is used to extract data from your transactional system to create a consolidated data warehouse or data mart for reporting and analysis. This will be added to the following: Web Data Extractors white paper; Data Mining Resources Subject Tracer™; Business Intelligence Resources Subject Tracer™; Entrepreneurial Resources Subject Tracer™; the tools section of Research Resources Subject Tracer™; and Start Up Resources for the Entrepreneur 2018 white paper.

 

OpenMinTeD – Text and Data Mining the Next Data Frontier

April 24, 2018

http://openminted.eu/

OpenMinTeD takes an infrastructural approach to making sense of large volumes of scientific content. It is building an online platform where researchers can find text mining tools and services from around the world. The platform will also contain an overview of openly available text and data sources, ready to be mined. To build a successful online platform, they are looking for researchers and organizations who would like to make their text mining tools or services available for discovery on the platform. To make registering tools or services with OpenMinTeD even more attractive, they are now working on a way to use different text mining tools and services in combination with each other. This will allow researchers to use best-of-breed text mining components for specific tasks and will bring about new and exciting text and data mining discoveries. On the OpenMinTeD platform, they are planning to provide the following services so that researchers can easily access registered text mining tools and services and start mining: a) a registry of all available text mining tools and services; b) interoperability guidelines to describe text and data mining services as resources, their accompanying licenses, policies and operational details, and common protocols to operate and be deployed in a local or a cloud context; c) an annotation service that implements the interoperability specifications for annotations to showcase common representations and protocols; and d) a workflow service to help researchers mix and match different text mining services. This will be added to the following: Data Mining Resources Subject Tracer™; Business Intelligence Resources Subject Tracer™; Entrepreneurial Resources Subject Tracer™; the tools section of Research Resources Subject Tracer™; and Start Up Resources for the Entrepreneur 2018 white paper.

Updated> Data Mining Resources 2018 Whitepaper Dataset Link Compilation

April 10, 2018

http://www.DataMiningResources.info/

I have just updated my Data Mining Resources 2018 Subject Tracer™ Whitepaper Dataset Link Compilation. The 34-page (291KB) .pdf white paper is now available from the above URL. It lists alphabetically the latest resources and sources for data mining available from the Internet. [Completely updated with all links validated and new URLs added on April 10, 2018] Additional white papers and resources by Marcus P. Zillman are also available.

Datasets for Data Mining and Data Science

April 07, 2018

https://www.kdnuggets.com/datasets/index.html

A comprehensive listing of datasets for data mining and data science. This will be added to the following: Data Mining Resources Subject Tracer™; New Economy Resources Subject Tracer™; Statistics Resources and Big Data Subject Tracer™; Deep Web Research and Discovery Resources; Business Intelligence Resources Subject Tracer™; Entrepreneurial Resources Subject Tracer™; and the tools section of Research Resources Subject Tracer™.
