scikit-learn

Also known as scikits.learn, sklearn, scikit

scikit-learn (formerly scikits.learn and also known as sklearn) is a free and open-source machine learning library for the Python programming language. It features various classification, regression and clustering algorithms including support-vector machines, random forests, gradient boosting, k-means and DBSCAN, and is designed to interoperate with the Python numerical and scientific libraries NumPy and SciPy. Scikit-learn is a NumFOCUS fiscally sponsored project.

Key facts

Software.name: scikit-learn
Software.logo: Scikit learn logo small.svg
Software.author: David Cournapeau
Software.developer: Google Summer of Code project
Software.programming language: Python, Cython, C and C++
Software.operating system: Linux, macOS, Windows
Software.genre: Library for machine learning
Software.license: New BSD License

via Wikipedia infobox

Source code

github.com →

scikit-learn is a Python module for machine learning built on top of SciPy and is distributed under the 3-Clause BSD license. The project was started in 2007 by David Cournapeau as a Google Summer of Code project, and since then many volunteers have contributed. See the About us page for a list of core contributors. We welcome new contributors of all experience levels. The scikit-learn community goals are to be helpful, welcoming, and effective. The Development Guide has detailed information about contributing code, documentation, tests, and more. We've included some basic information in this README. Random number generation can be controlled during testing by setting the SKLEARN SEED environment variable. The project was started in 2007 by David Cournapeau as a Google Summer of Code project, and since then many volunteers have contributed. See the About us page for a list of core contributors.

~8 min read

Article

14 sections

Contents

Overview
Features
Examples
Implementation
History
Applications
Finance and Insurance
Retail and E-Commerce
Media, Marketing, and Social Platforms
Technology
Academia
Awards
References
External links

==Overview== The scikit-learn project started as scikits.learn, a Google Summer of Code project by French data scientist David Cournapeau. The name of the project derives from its role as a "scientific toolkit for machine learning", originally developed and distributed as a third-party extension to SciPy. The original codebase was later rewritten by other developers. In 2010, contributors Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort and Vincent Michel, from the French Institute for Research in Computer Science and Automation in Saclay, France, took leadership of the project and released the first public version of the library on February 1, 2010. In November 2012, scikit-learn as well as scikit-image were described as two of the "well-maintained and popular" . In 2019, it was noted that scikit-learn is one of the most popular machine learning libraries on GitHub. At that time, the project had over 1,400 contributors and the documentation received 42 million visits in 2018. According to a 2022 Kaggle survey of nearly 24,000 respondents from 173 countries, scikit-learn was identified as the most widely used machine learning framework.