Posts by Category: Data Mining Resources

Data Mining and Knowledge Discovery

June 09, 2018

https://link.springer.com/journal/10618

The premier technical publication in the field, Data Mining and Knowledge Discovery is a resource collecting relevant common methods and techniques and a forum for unifying the diverse constituent research communities. The journal publishes original technical papers in both the research and practice of data mining and knowledge discovery, surveys and tutorials of important areas and techniques, and detailed descriptions of significant applications. Coverage includes: a) Theory and Foundational Issues; b) Data Mining Methods; c) Algorithms for Data Mining; d) Knowledge Discovery Process; and e) Application Issues. This will be added to Data Mining Resources Subject Tracer™. This will be added to Knowledge Discovery Subject Tracer™. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™.

EU Open Data Portal

May 19, 2018

https://data.europa.eu/euodp/en/home

The European Union Open Data Portal (EU ODP) gives you access to open data published by EU institutions and bodies. All the data you can find via this catalogue are free to use and reuse for commercial or non-commercial purposes. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to Data Mining Resources Subject Tracer™. This will be added to Bot Research Subject Tracer™. This will be added to Web Data Extractors White Paper. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™. This will be added to Deep Web Research and Discovery Resources Subject Tracer™.

UC Irvine Machine Learning Repository

May 19, 2018

https://archive.ics.uci.edu/ml/index.php

The repository currently maintains 425 data sets as a service to the machine learning community. You may view all data sets through its searchable interface. The old web site is still available for those who prefer the old format. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to Data Mining Resources Subject Tracer™. This will be added to Bot Research Subject Tracer™. This will be added to Web Data Extractors White Paper. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™. This will be added to Deep Web Research and Discovery Resources Subject Tracer™.

Kickstarter Datasets

May 18, 2018

https://webrobots.io/kickstarter-datasets/

They run a scraper robot that crawls all Kickstarter projects and collects data in JSON format. Since March 2016 they have run this crawl once a month. Datasets are available through last month. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to Data Mining Resources Subject Tracer™. This will be added to Bot Research Subject Tracer™. This will be added to Web Data Extractors White Paper. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™. This will be added to Deep Web Research and Discovery Resources Subject Tracer™.

Indiegogo Datasets

May 18, 2018

https://webrobots.io/indiegogo-dataset/

They run a scraper robot that crawls Indiegogo projects and collects data about them. The robot was launched in May 2016 and they run a crawl once a month. The first dataset contains data about 91.5k projects. This will be added to Statistics Resources and Big Data Subject Tracer™. This will be added to Data Mining Resources Subject Tracer™. This will be added to Bot Research Subject Tracer™. This will be added to Web Data Extractors White Paper. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™. This will be added to Deep Web Research and Discovery Resources Subject Tracer™.

Overview – Open Source Document Mining

May 14, 2018

https://blog.overviewdocs.com/

Overview is a document mining application originally built for investigative journalists. It’s also used for legal work, training machine learning models, and research of all types. It’s a visualization and analysis tool designed for sets of documents, from dozens to millions of pages of material. Overview imports many formats and languages, includes built-in OCR, a sophisticated search engine, document annotation, word clouds, entity detection, and topic-based document clustering. It has tagging and metadata support and supports many input and export formats. If you need custom analysis, you can write your own plugins using the API. This will be added to Journalism Resources Subject Tracer™. This will be added to Data Mining Resources Subject Tracer™. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™.

Webhose.io – Turn Unstructured Web Content Into Machine-Readable Data Feeds That You Can Consume On Demand

May 12, 2018

https://webhose.io/

They provide on-demand access to web data feeds anyone can consume. Webhose.io empowers you to build, launch, and scale data-driven operations as you grow, whether you’re an entrepreneur, a researcher, or a senior executive at a Fortune 500 company. Developers get free access to the same web data feeds that power their growing customer base of global media analytics and monitoring leaders. Every web data feed is optimized to deliver up-to-the-minute coverage of a specific content domain, such as news, blogs, online discussions, and more. Just define your filters so you can focus on what you do best. Webhose.io is the brainchild of Ran Geva and Guy Mor, two entrepreneurs with extensive experience in technology, data mining, and product development who set out to build a simple solution to a complicated problem for anyone who wants to consume data from the web. This will be added to Data Mining Resources Subject Tracer™. This will be added to Web Data Extractors White Paper. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™.

Awareness Watch Talk Show for Wednesday May 2, 2018 at 2:00pm EDT – Data Mining Resources 2018

May 02, 2018

http://www.BlogTalkRadio.com/AwarenessWatch/

This program will feature my recently updated white paper Data Mining Resources 2018. We will be highlighting the latest and greatest resources covering data mining: sources, search engines, subject directories, articles, guides and tracers, literally everything on the Internet covering data mining! We will also be discussing my latest freely available Awareness Watch Newsletter V16N5 May 2018 featuring 2018 New Economy Resources and Tools, as well as my freely available May 2018 Zillman Columns highlighting eCommerce Resources on the Internet 2018. You may call in to ask your questions at (718)508-9839. The show is live and thirty minutes in length, starting at 2:00pm EDT on Wednesday, May 2, 2018, and is then archived for easy review and access. Listen, call and enjoy.

Tanagra Project – Free Data Mining Software for Academic and Research Purposes

April 28, 2018

http://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html

TANAGRA is free data mining software for academic and research purposes. It offers several data mining methods from exploratory data analysis, statistical learning, machine learning, and databases. The project is the successor of SIPINA, which implements various supervised learning algorithms, especially the interactive and visual construction of decision trees. TANAGRA is more powerful: it contains supervised learning methods but also other paradigms such as clustering, factorial analysis, parametric and nonparametric statistics, association rules, and feature selection and construction algorithms. TANAGRA is an “open source project”: every researcher can access the source code and add their own algorithms, provided they agree to and conform with the software distribution license. The main purpose of the Tanagra project is to give researchers and students easy-to-use data mining software that conforms to current norms of software development in this domain (especially in the design of its GUI and the way it is used) and allows them to analyse either real or synthetic data. The second purpose of TANAGRA is to offer researchers an architecture that lets them easily add their own data mining methods and compare their performance. In this respect TANAGRA acts as an experimental platform that lets them focus on the essentials of their work and spares them the unpleasant part of programming this kind of tool: data management. The third and last purpose, aimed at novice developers, is to disseminate a possible methodology for building this kind of software. They can take advantage of free access to the source code to see how this sort of software is built, the problems to avoid, the main steps of the project, and which tools and code libraries to use. In this way, Tanagra can be considered a pedagogical tool for learning programming techniques.
TANAGRA does not presently include what makes up the strength of commercial software in this domain: a wide set of data sources, direct access to data warehouses and databases, data cleansing, interactive use, and so on. This will be added to Data Mining Resources Subject Tracer™. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™.

Jaspersoft® ETL – The Open Source Data Integration Platform

April 24, 2018

https://community.jaspersoft.com/project/jaspersoft-etl

Jaspersoft ETL is easy to deploy and outperforms many proprietary and open source ETL systems. It is used to extract data from your transactional system to create a consolidated data warehouse or data mart for reporting and analysis. This will be added to Web Data Extractors White Paper. This will be added to Data Mining Resources Subject Tracer™. This will be added to Business Intelligence Resources Subject Tracer™. This will be added to Entrepreneurial Resources Subject Tracer™. This will be added to the tools section of Research Resources Subject Tracer™. This will be added to Start Up Resources for the Entrepreneur 2018 white paper.

 
