Data Mining Resources 2018


[Download Data Mining Resources 2018 White Paper Link Dataset Compilation]

Data Mining Resources is a Subject Tracer™ Information Blog developed and created by the Virtual Private Library™. It brings together, on an ongoing basis, the latest Internet resources and sources for data mining, which are listed below. Suggestions of additional sites and resources to be added to this listing are always welcome. This site has been developed and maintained by Marcus P. Zillman, M.S., A.M.H.A., Internet expert, author, keynote speaker, and consultant. His latest white papers include Searching the Internet, Academic and Scholar Search Engines and Sources, and Knowledge Discovery Resources 2017. All of his Subject Tracer™ Information Blogs and white papers are available online, as is his latest monthly column. Subscribe to his free monthly Awareness Watch™ Newsletter.

80legs – Custom Web Crawlers for Crawling and Processing Web Content

ACM SIGKDD: Current Explorations Issue

Apache Pig – Platform for Analyzing Large Datasets

ARTstor – Digital Image Library for Education and Scholarship

AZMY Thinkware – Data Analysis and Mining Software Tools

Benchmarking – Data Mining Benchmarking Association

Bibliomining for Automated Collection Development in a Digital Library Setting: Using Data Mining to Discover Web-Based Scholarly Research Works by Dr. Scott Nicholson

BI-DW – Business Intelligence and Data Warehousing Directory

Biomedical LIterature (and text) Mining Publications (BLIMP)

Bixo – Open Source Web Mining Toolkit

BLIASoft Knowledge Discovery

Bot Research

Business Intelligence Resources

CCSU – Data Mining

Center for Automated Learning and Discovery

ChartSearch – Intelligent Data Search

Chronicling America – Library of Congress – National Digital Newspaper Program

COREMINE Medical – Biomedical Mindmap

Current Awareness Discovery Tools on the Internet

DataMelt – Computation and Visualization Environment

D2K – Data to Knowledge

Data Engineering Bulletin

DataFerrett – Data Mining Tool

Data Fountains: Open Source Internet Resource Discovery and Metadata/Full-Text Generation Service

Data Mining

Data Mining and Analytic Technologies

Data Mining and KDD Papers

Data Mining and Knowledge Discovery Journal

Data Mining for the Masses

Data Mining – Federal Efforts Cover a Wide Range of Uses Report

DataMiningGrid Consortium

Data Mining Group (DMG)

Data Mining, Predictive Modeling, Business Analytics: Training, Consulting & Solutions

Data Mining Resources

Data Mining Resources

Data Mining Resources at CCSU

Data Mining: Technology and Policy – The DHS Privacy Office

Data Mining: Text Mining, Visualization and Social Media

Data Mining, Web Scraping, Web Mining, Data Extraction and Screen Scraping Technology Links

Data Mining, Web Mining, and Business Intelligence Solutions from Salford Systems

Datanami – Big Data, Big Analytics, and Big Insights

Data Science Toolkit

Data Shaping Data Mining Resources

Data Sources

DbVisualizer – The Universal Database Tool

DeepDive – Analyze Data On a Deeper Level Than Ever Before

Deep Web Research Resources

Digital Library for Earth System Education (DLESE)

Directory of Data Warehouse, Data Mining, and Decision Support Resources

DiscoverText – Capture Text Data and Crunch Your Data

Easy PDF Cloud

eBiquity Research Group Blogger

Early Canadiana Online

Elastic Web Mining Talk

Enterprise Semantic Intelligence™ Knowledge Suite

Everything You Wanted to Know About Data Mining but Were Afraid to Ask by Alexander Furnas

Exclusive Ore, Inc.

FACTA+ – Finding Associated Concepts with Text Analysis

Four-T-Nine-R(sm): Data Mining in Web and non-Web Bibliographic Databases

Google Refine 2.0 – Power Tool for Data Wranglers

Graf-FX – Visual Database Data Mining Software

Great War Primary Documents Archive

History of Data Mining by Raymond Li

Howard D. Wactlar Home Page

Imagination Engines

InfoBionics – Flexible Data Mining Applications

Information Retrieval (IR) and Information Extraction (IE) on the Web Using Hypertext Meta-Data and Structure

Information Retrieval Intelligence

InfoVis CyberInfrastructure

International Journal of Business Intelligence and Data Mining (IJBIDM)

International Journal of Data Mining and Bioinformatics (IJDMB)

International Journal of Data Warehousing and Mining (IJDWM)

Internet Archive

Inter-university Consortium for Political and Social Research (ICPSR)

Journal of Data Mining and Knowledge Discovery

Junar – The Open Data Platform

Kaggle – Go from Big Data to Big Analytics

KDnuggets: Data Mining, Web Mining, and Knowledge Discovery Guide

KEEL (Knowledge Extraction based on Evolutionary Learning)

KNIME – Konstanz Information Miner Open Source Software

Knowledge Discovery Resources

Knowledge Discovery Resources 2014 Annotated White Paper Link Compilation by Marcus P. Zillman, M.S., A.M.H.A.

KnowleSys – Web Public Opinion Monitoring

LingPipe – Information Extraction and Data Mining Tools

LLRX – A Review of TRACFed: Lawyers Strike Gold Mining Government Data

LLRX – Deep Web Research and Discovery Resources 2013

LoginWorks – Advanced Solutions – Data Mining and Web Scraping

Marcus P. Zillman Home Page

Marriott Library at the University of Utah Digital Collections

Marti Hearst Home Page

Media Patterns – Detecting Patterns in the Global Media Content

Megaputer – Data Mining and Text Mining Software

Microsoft® Data Mining Project – Efficient Data Exploration and Modeling

MineKnowledge – Revealing Your Data’s Secrets

MOA (Massive Online Analysis)

MoData – Big Data Resources

MonetDB Query Processing at Light Speed

Mozenda – Data Extraction and Comprehensive Web Data Gathering

National Archives, London

National Centre for Text Mining (NaCTeM)

National Science Digital Library (NSDL)

National Technical Information Service (NTIS)

Nebraska Digital Newspaper Project

Nesstar – Publish Data on the Web

NetOwl – Entity Extraction and Entity Analytics for Big Data

New York Public Library

Nuix – eDiscovery and Electronic Investigation Software

Oceanstore Project

OntoMiner: Bootstrapping and Populating Ontologies From Domain Specific Web Sites

Open Directory Project – Data Mining

Opening History (OH) – U.S. History Resources from Libraries, Museums, and Archives

Open/Public Data Sources

Open Source Data Mining Tools

Open Source Data Warehousing, Big Data Analytics

Orange – Open Source Data Visualization and Analysis for Novices and Experts

PC AI Magazine Artificial Intelligence

Pentaho BI Project – Open Source Business Intelligence

PEPITe S.A. – Unlock Your Knowledge

Prediction Markets

Predictive Model Markup Language (PMML) – Project Info

Predictive Model Markup Language (PMML)

PubChase – Discover Biomedical Research of Interest To You

Pudget – Science at Speed

QDA Miner Lite (Freeware)

QL2 Software – Unstructured Data Management and Web Mining Software

QueryTree – Explore Data Without Code

Raghu Ramakrishnan Home Page

RapidMiner – Open Source Data Mining Tool

Rattle – Data Mining Toolkit in R

Rexer Analytics – Analytic and CRM Consulting

Ron Kohavi Home Page

Samepoint – Reputation Management Social Media Search

SAS – Data and Text Mining

SCaVis – Scientific Computation and Visualization Environment

Scholarly Database at the Cyberinfrastructure for Network Science Center, Indiana University

Screen-Scraper – Data Extraction Software and Services

Searching the Internet

Semantic Scholar – Free Scientific Literature Search and Discovery

SIGKDD – ACM Special Interest Group – Knowledge Discovery in Data and Data Mining

Smithsonian/NASA Astrophysics Data System (ADS)

Social Buzz Bot – Business Intelligence Data Mining for Information Discovery from Social Communities

Software Suites for Data Mining, Analytics, and Knowledge Discovery

Special Interest Group – Knowledge Discovery in Data and Data Mining – SIGKDD Explorations Newsletter

SPMF – Open Source Data Mining Library

SQL Server Data Mining

Statistical Analysis and Data Mining

Statistical Data Mining Tutorials – Tutorial Slides by Andrew Moore

Statoo Statistical Consulting + Data Analysis + Data Mining

Survey of DHS Data Mining Activities – Office of Information Technology

Talend Open Data Solutions

Texifter – Search, Sift, Sort & Classify Documents

Text Data Mining

Text Mining for Scholarly Communications and Repositories

Text Mining, Web Mining, Information Retrieval and Extraction from the WWW References

The Archaeology Data Service (ADS)

The Centre for Contemporary Canadian Art – Canadian Art Database Project

The Data Mine

The History Data Service (HDS)

The National Centre for Text Mining: Aims and Objectives by Sophia Ananiadou, Julia Chruszcz, John Keane, John McNaught and Paul Watry

The New York Times Article Search API

The Open Access Digital Library

Togaware – Data Mining Resources

Topic Detection and Tracking (TDT)

T-Rex (Trainable Relation Extraction)

Truthy – Analyze and Visualize the Diffusion of Information on Twitter

Unit Miner – Web Data Extraction Software

University of Florida Digital Collections (UFDC)

University of North Texas Digital Collections

Using the Internet As a Dynamic Resource Tool for Knowledge Discovery

Vendor-Neutral Public Courses on Data Mining Strategy, Methods & Practice

VisitorVille – Web Site Intelligence

Visual Analytics from Raytheon

Web Curator Tool (WCT) – Management of Selective Web Harvesting Process

Web Data Extractors – White Paper Link Compilation – Farming the Web for Systematic Business Intelligence

Web-Harvest – Open Source Web Data Extraction Tool written in Java

Web Harvesting by Russell Kay

Webzeitgeist – Design Mining the Web

Weka 3: Data Mining Software in Java

Weka 3 – Data Mining with Open Source Machine Learning Software in Java

White Papers by Marcus P. Zillman, M.S., A.M.H.A.

WizSoft – Data and Text Mining

Yahoo Groups – Data Mining

(c)2017 Marcus P. Zillman, M.S., A.M.H.A.
