Data Mining: Practical Machine Learning Tools and Techniques
Elsevier, Feb 3, 2011 - 664 pages

Data Mining: Practical Machine Learning Tools and Techniques, Third Edition, offers a thorough grounding in machine learning concepts as well as practical advice on applying machine learning tools and techniques in real-world data mining situations. This highly anticipated third edition of the most acclaimed work on data mining and machine learning will teach you everything you need to know about preparing inputs, interpreting outputs, evaluating results, and the algorithmic methods at the heart of successful data mining.

Thorough updates reflect the technical changes and modernizations that have taken place in the field since the last edition, including new material on Data Transformations, Ensemble Learning, Massive Data Sets, and Multi-instance Learning, plus a new version of the popular Weka machine learning software developed by the authors. Witten, Frank, and Hall include both tried-and-true techniques of today and methods at the leading edge of contemporary research.

The book is targeted at information systems practitioners, programmers, consultants, developers, information technology managers, specification writers, data analysts, data modelers, database R&D professionals, data warehouse engineers, and data mining professionals. It will also be useful for professors and students of upper-level undergraduate and graduate-level data mining and machine learning courses who want to incorporate data mining into their data management knowledge base and expertise.
- Provides a thorough grounding in machine learning concepts as well as practical advice on applying the tools and techniques to your data mining projects
- Offers concrete tips and techniques for performance improvement that work by transforming the input or output in machine learning methods
- Includes the downloadable Weka software toolkit, a collection of machine learning algorithms for data mining tasks, in an updated, interactive interface. Algorithms in the toolkit cover data pre-processing, classification, regression, clustering, association rules, and visualization
Common terms and phrases
accuracy, applied, ARFF, association rules, attribute selection, attribute values, Bayesian, Bayesian networks, choose, class value, classifier, clustering, computing, consider, cross-validation, data mining, dataset, decision tree, default, discretization, distribution, documents, ensemble, evaluation, example, false, Figure, filter, function, humidity, hyperrectangle, implementation, input, instance-based, instance-based learning, interface, Iris setosa, Iris versicolor, Iris virginica, item sets, iterations, kD-trees, kernel, leaf, learning algorithm, learning scheme, linear models, linear regression, logistic regression, machine learning, maximum-margin hyperplane, minimum, missing values, model trees, multi-instance, multilayer perceptrons, Naïve Bayes, node, nominal attributes, normal, number of instances, numeric attributes, object editor, options, outlook, output, overcast, overfitting, parameter, perceptron, performance, probability estimates, problem, pruning, random, result, sample, simple, specified, split, statistical, structure, subset, subtree, sunny, support vector machines, Table, techniques, temperature, test instance, test set, text mining, training data, training instances, training set, two-class, Visualize, weather data, weight, Weka, windy, yes yes
