Enhancements in PyCaret 3.0: Key Features Unveiled

Introduction

PyCaret is a low-code, open-source machine learning library in Python designed to streamline machine learning processes. It serves as a comprehensive tool for managing models and automating workflows, significantly accelerating the experimental cycle and enhancing productivity.

Unlike many other open-source libraries, PyCaret takes a low-code approach that can replace hundreds of lines of code with just a few, which makes experimentation faster and more efficient. It is essentially a Python wrapper around several machine learning libraries and frameworks, such as scikit-learn, XGBoost, LightGBM, and CatBoost.

The design of PyCaret is influenced by the growing presence of citizen data scientists—individuals who analyze data and derive insights without formal training in data science or statistics.

1. Time Series Module

With PyCaret 3.0, the Time Series module is now stable and generally available. The module is built for time series analysis, that is, for data recorded at successive points in time.

The Time Series module in PyCaret 3.0 excels at forecasting tasks, providing an easy-to-use interface that allows users of all skill levels to perform forecasting operations with minimal coding.

Future enhancements will enable the module to support time-series anomaly detection—identifying unusual patterns in time-series data—and time-series clustering, which groups similar data points based on their time-series behavior.

    # Load dataset
    from pycaret.datasets import get_data
    data = get_data('airline')

    # Initialize setup
    from pycaret.time_series import *
    s = setup(data, fh=12, session_id=123)

    # Compare models
    best = compare_models()

    # Forecast plot
    plot_model(best, plot='forecast')

    # Forecast plot 36 months into the future (the airline data is monthly)
    plot_model(best, plot='forecast', data_kwargs={'fh': 36})
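
To make the `fh` (forecast horizon) argument more concrete, here is a toy sketch in plain Python. It is not PyCaret's implementation: the series values, the season length, and the helper names `split_by_horizon`, `seasonal_naive_forecast`, and `mean_absolute_error` are all made up for illustration. It assumes the hold-out set is simply the final `fh` observations, and it forecasts them by repeating the last observed season, which is about the simplest baseline a forecasting comparison could include.

```python
from typing import List, Tuple


def split_by_horizon(series: List[float], fh: int) -> Tuple[List[float], List[float]]:
    """Hold out the last `fh` observations as the test set."""
    if not 0 < fh < len(series):
        raise ValueError("fh must be between 1 and len(series) - 1")
    return series[:-fh], series[-fh:]


def seasonal_naive_forecast(train: List[float], fh: int, season_length: int) -> List[float]:
    """Forecast each step with the value at the same position in the last observed season."""
    if len(train) < season_length:
        raise ValueError("need at least one full season of training data")
    last_season = train[-season_length:]
    return [last_season[step % season_length] for step in range(fh)]


def mean_absolute_error(actual: List[float], predicted: List[float]) -> float:
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical series: a 4-period seasonal pattern that drifts upward by 2 each season.
series = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]

train, test = split_by_horizon(series, fh=4)
forecast = seasonal_naive_forecast(train, fh=4, season_length=4)
mae = mean_absolute_error(test, forecast)

print("forecast:", forecast)
print("MAE:", mae)
```

Because the toy series drifts upward, the seasonal-naive forecast lags behind the held-out values by a constant amount. That kind of gap is what a model comparison is meant to expose.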

2. Object-Oriented API

PyCaret has established its value in the data science community. However, its original functional API does not follow the object-oriented programming practices that most Python developers use. Consequently, we revisited some of the foundational design decisions made for the initial PyCaret 1.0 release.

This transition to a new object-oriented API will demand significant effort but is essential for aligning with Python's best programming practices, ensuring PyCaret remains a dependable tool for data scientists.

This modification will enhance accessibility for a broader audience, facilitating smoother integration with other Python libraries and frameworks, ultimately enabling more efficient data science workflows.

    # Functional API (Existing)

    # Load dataset
    from pycaret.datasets import get_data
    data = get_data('juice')

    # Initialize setup
    from pycaret.classification import *
    s = setup(data, target='Purchase', session_id=123)

    # Compare models
    best = compare_models()

With the functional API, running several experiments in the same notebook is convenient but fragile: each call to setup with different parameters replaces the settings of the previous experiment.

With the new object-oriented API, users can manage multiple experiments in a single notebook without conflicts, because each experiment object holds its own setup parameters along with its modeling and preprocessing choices.

    # Load dataset
    from pycaret.datasets import get_data
    data = get_data('juice')

    # Setup experiment 1
    from pycaret.classification import ClassificationExperiment
    exp1 = ClassificationExperiment()
    exp1.setup(data, target='Purchase', session_id=123)

    # Compare models for experiment 1
    best = exp1.compare_models()

    # Setup experiment 2
    exp2 = ClassificationExperiment()
    exp2.setup(data, target='Purchase', normalize=True, session_id=123)

    # Compare models for experiment 2
    best2 = exp2.compare_models()
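
The difference between the two styles can be illustrated without PyCaret at all. The sketch below is a toy stand-in, not PyCaret's code: the module-level `_global_config`, the function `toy_setup`, and the class `ToyExperiment` are hypothetical names chosen for illustration. It only shows why a single shared configuration gets overwritten while per-object configurations do not.

```python
# Toy stand-in for a functional API that keeps one module-level configuration.
_global_config = {}


def toy_setup(**params):
    """Functional style: every call replaces the single shared configuration."""
    global _global_config
    _global_config = dict(params)
    return _global_config


class ToyExperiment:
    """Object-oriented style: each instance owns its configuration."""

    def __init__(self):
        self.config = {}

    def setup(self, **params):
        # Store the parameters on this instance only.
        self.config = dict(params)
        return self


# Functional style: the second call overwrites the first.
toy_setup(target="Purchase", normalize=False)
toy_setup(target="Purchase", normalize=True)
functional_state = dict(_global_config)

# Object-oriented style: both configurations remain available side by side.
toy_a = ToyExperiment().setup(target="Purchase", normalize=False)
toy_b = ToyExperiment().setup(target="Purchase", normalize=True)

print("functional:", functional_state)
print("object-oriented:", toy_a.config, toy_b.config)
```

After the two functional calls, only the second configuration survives, whereas `toy_a` and `toy_b` each keep their own settings.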

After completing the experiments, call get_leaderboard on each experiment object to generate its leaderboard, then combine the results for easier comparison.

    import pandas as pd

    # Generate leaderboard
    leaderboard_exp1 = exp1.get_leaderboard()
    leaderboard_exp2 = exp2.get_leaderboard()
    lb = pd.concat([leaderboard_exp1, leaderboard_exp2])

    # Print pipeline steps
    print(exp1.pipeline.steps)
    print(exp2.pipeline.steps)

3. Experiment Logging

PyCaret 2 could log experiments automatically using MLflow. MLflow remains the default logger, but PyCaret 3 adds more options: wandb, cometml, and dagshub.

Switching from the default MLflow logger to any of the new options is easy. Simply pass the desired logger to the log_experiment parameter of the setup function. Available options are 'mlflow', 'wandb', 'cometml', and 'dagshub'.

This upgrade in logging capabilities enhances user flexibility in tracking and managing machine learning experiments, allowing data scientists to choose the tools that best fit their requirements.
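
As a hedged sketch of how that choice might be wired up: the helper `logging_kwargs` below is hypothetical and not part of PyCaret. It only validates a logger name against the four options listed above and builds keyword arguments for setup. The sketch assumes that setup accepts `log_experiment` and `experiment_name`, and that the chosen logger's package is installed and authenticated. The experiment name `juice-demo` is a placeholder.

```python
SUPPORTED_LOGGERS = ("mlflow", "wandb", "cometml", "dagshub")


def logging_kwargs(logger: str, experiment_name: str) -> dict:
    """Build the logging-related keyword arguments for a setup call."""
    if logger not in SUPPORTED_LOGGERS:
        raise ValueError(
            f"unknown logger {logger!r}; expected one of {SUPPORTED_LOGGERS}"
        )
    return {"log_experiment": logger, "experiment_name": experiment_name}


def run_logged_experiment(data, target: str, logger: str = "mlflow"):
    """Set up a classification experiment that logs to the chosen backend."""
    # Imported lazily so the helper above can be used without PyCaret installed.
    from pycaret.classification import ClassificationExperiment

    exp = ClassificationExperiment()
    exp.setup(
        data,
        target=target,
        session_id=123,
        **logging_kwargs(logger, "juice-demo"),  # placeholder experiment name
    )
    return exp


# Inspect the arguments that would be passed to setup for a wandb run.
print(logging_kwargs("wandb", "juice-demo"))
```

With PyCaret and the logger installed, `run_logged_experiment(data, target='Purchase', logger='wandb')` would be the corresponding call for the juice dataset used earlier.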

Liked the blog? Connect with Moez Ali

Moez Ali is a forward-thinking innovator and technologist. Transitioning from data scientist to product manager, he is committed to developing cutting-edge data products and fostering vibrant open-source communities.

As the creator of PyCaret, he has authored over 100 publications with more than 500 citations and is recognized globally for his contributions to open-source projects in Python.

Check out my personal website: https://www.moez.ai.

To learn more about my open-source endeavors, explore the PyCaret GitHub repository or follow PyCaret’s official LinkedIn page.

Listen to my talk on Time Series Forecasting with PyCaret at the DATA+AI SUMMIT 2022 by Databricks.
