ORCID is a nonprofit helping create a world in which all who participate in research, scholarship, and innovation are uniquely identified and connected to their contributions and affiliations across disciplines, borders, and time. ORCID provides a persistent digital identifier that distinguishes you from every other researcher and, through integration in key research workflows such as manuscript and grant submission, supports automated linkages between you and your professional activities, ensuring that your work is recognized. The organization is supported by a global community of organizational members, including research organizations, publishers, funders, professional associations, and other stakeholders in the research ecosystem. This will be added to the tools section of Research Resources Subject Tracer™.
Meedan builds digital tools for global journalism and translation. We are a team of designers, technologists and journalists who focus on open source investigation of digital media and crowdsourced translation of social media. With commercial, media and university partners, we support research, curriculum development, and new forms of digital storytelling. This will be added to Journalism Resources Subject Tracer™.
Machine learning algorithms can be divided into three broad categories: supervised learning, unsupervised learning, and reinforcement learning. Supervised learning is useful in cases where a property (label) is available for a certain dataset (training set), but is missing and needs to be predicted for other instances. Unsupervised learning is useful in cases where the challenge is to discover implicit relationships in a given unlabeled dataset (items are not pre-assigned to any label). Reinforcement learning falls between these two extremes: there is some form of feedback available for each predictive step or action, but no precise label or error message. The article is based on an introductory class, so it does not cover reinforcement learning; its author hopes that the 10 algorithms on supervised and unsupervised learning will be enough to keep you interested. This will be added to Artificial Intelligence Resources Subject Tracer™.
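The difference between the first two categories can be illustrated with a minimal Python sketch. It is not taken from the article itself: the sample points, the labels "small" and "large", and the function names are all hypothetical, and 1-nearest-neighbor and k-means are used only as simple stand-ins for a supervised and an unsupervised method. Reinforcement learning is not shown.

```python
from math import dist  # Euclidean distance, Python 3.8+

def nearest_neighbor_predict(training_set, point):
    """Supervised: return the label of the closest labeled example."""
    _, label = min(training_set, key=lambda pair: dist(pair[0], point))
    return label

def k_means(points, initial_centroids, iterations=10):
    """Unsupervised: group unlabeled points around k centroids."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical labeled training set: the label is known for these points.
labeled = [
    ((1.0, 1.0), "small"), ((1.5, 2.0), "small"),
    ((8.0, 8.0), "large"), ((9.0, 9.5), "large"),
]

# Hypothetical unlabeled data: no label is available at all.
unlabeled = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
             (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

# Supervised: predict the missing label for a new instance.
print(nearest_neighbor_predict(labeled, (2.0, 1.5)))

# Unsupervised: discover two groups without using any labels.
centroids, clusters = k_means(unlabeled, [unlabeled[0], unlabeled[3]])
print(clusters)
```

The supervised function needs the labels in the training set to make a prediction, while the clustering function sees only coordinates and has to find the grouping on its own.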
ELKI is open source (AGPLv3) data mining software written in Java. The focus of ELKI is research in algorithms, with an emphasis on unsupervised methods in cluster analysis and outlier detection. To achieve high performance and scalability, ELKI offers data index structures such as the R*-tree that can provide major performance gains. ELKI is designed to be easy to extend for researchers and students in this domain, and welcomes contributions of additional methods. ELKI aims to provide a large collection of highly parameterizable algorithms to allow easy and fair evaluation and benchmarking of algorithms. This will be added to Data Mining Resources Subject Tracer™.
MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text. MALLET includes sophisticated tools for document classification: efficient routines for converting text to “features”, a wide variety of algorithms (including Naïve Bayes, Maximum Entropy, and Decision Trees), and code for evaluating classifier performance using several commonly used metrics. In addition to classification, MALLET includes tools for sequence tagging for applications such as named-entity extraction from text. Algorithms include Hidden Markov Models, Maximum Entropy Markov Models, and Conditional Random Fields. These methods are implemented in an extensible system for finite state transducers. This will be added to Data Mining Resources Subject Tracer™. This will be added to Artificial Intelligence Resources Subject Tracer™.
Deeplearning4j is the first commercial-grade, open-source, distributed deep-learning library written for Java and Scala. Integrated with Hadoop and Spark, DL4J is designed to be used in business environments on distributed GPUs and CPUs. Deeplearning4j aims to be cutting-edge plug and play, more convention than configuration, which allows for fast prototyping for non-researchers. DL4J is customizable at scale. Released under the Apache 2.0 license, all derivatives of DL4J belong to their authors. DL4J can import neural net models from most major frameworks via Keras, including TensorFlow, Caffe, Torch and Theano, bridging the gap between the Python ecosystem and the JVM with a cross-team toolkit for data scientists, data engineers and DevOps. This will be added to Data Mining Resources Subject Tracer™. This will be added to Artificial Intelligence Resources Subject Tracer™.
MOA is the most popular open source framework for data stream mining, with a very active, growing community. It includes a collection of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection, and recommender systems) and tools for evaluation. Related to the WEKA project, MOA is also written in Java, while scaling to more demanding problems. This will be added to Data Mining Resources Subject Tracer™. This will be added to Artificial Intelligence Resources Subject Tracer™.
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes. Weka is open source software issued under the GNU General Public License. The software shares its name with the weka, an inquisitive flightless bird found only on the islands of New Zealand. The project offers several free online courses that teach machine learning and data mining using Weka; check the courses website for details on when and how to enroll. The course videos are available on YouTube. Weka can also be applied to big data. This will be added to Data Mining Resources Subject Tracer™. This will be added to Artificial Intelligence Resources Subject Tracer™.
The annotated white paper titled “Searching the Internet 2017 – A Primer” by Marcus P. Zillman, M.S., A.M.H.A. has been updated. It is a primer both for those new to searching the Internet and for experienced searchers looking for new and innovative search sources. It is freely available as a 19-page .pdf document (394KB) from the Virtual Private Library™. Other white papers are also available. [Updated with all links validated and new links added on July 15, 2017]