SCIKIT_LEARN_DATASET

Retrieve a pandas DataFrame from the scikit-learn sample datasets.

Params:
dataset_name : str
    The name of the dataset to load from the sklearn.datasets package. Defaults to "iris".

Returns:
out : DataFrame
    A DataContainer object containing the retrieved pandas DataFrame.
Python Code
from typing import Literal

from flojoy import DataFrame, flojoy


@flojoy()
def SCIKIT_LEARN_DATASET(
    dataset_name: Literal[
        "iris", "diabetes", "digits", "linnerud", "wine", "breast_cancer"
    ] = "iris",
) -> DataFrame:
    """Retrieve a pandas DataFrame from the scikit-learn sample datasets.

    Parameters
    ----------
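    dataset_name : str
        Name of the scikit-learn sample dataset to load: "iris", "diabetes",
        "digits", "linnerud", "wine", or "breast_cancer". Defaults to "iris".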

    Returns
    -------
    DataFrame
        A DataContainer object containing the retrieved pandas DataFrame.
    """

    if dataset_name == "iris":
        from sklearn.datasets import load_iris

        iris = load_iris(as_frame=True, return_X_y=True)
        return DataFrame(df=iris[0])  # type: ignore

    elif dataset_name == "diabetes":
        from sklearn.datasets import load_diabetes

        diabetes = load_diabetes(as_frame=True, return_X_y=True)
        return DataFrame(df=diabetes[0])  # type: ignore

    elif dataset_name == "digits":
        from sklearn.datasets import load_digits

        digits = load_digits(as_frame=True, return_X_y=True)
        return DataFrame(df=digits[0])  # type: ignore

    elif dataset_name == "linnerud":
        from sklearn.datasets import load_linnerud

        linnerud = load_linnerud(as_frame=True, return_X_y=True)
        return DataFrame(df=linnerud[0])  # type: ignore

    elif dataset_name == "wine":
        from sklearn.datasets import load_wine

        wine = load_wine(as_frame=True, return_X_y=True)
        return DataFrame(df=wine[0])  # type: ignore

    elif dataset_name == "breast_cancer":
        from sklearn.datasets import load_breast_cancer

        breast_cancer = load_breast_cancer(as_frame=True, return_X_y=True)
        return DataFrame(df=breast_cancer[0])  # type: ignore

    else:
        raise ValueError(
            f"Failed to retrieve '{dataset_name}' from the scikit-learn datasets!"
        )


Example


The SCIKIT_LEARN_DATASET app

The workflow of this app is described below:

SCIKIT_LEARN_DATASET: This node takes one parameter, dataset_name, the name of the dataset to load from the sklearn.datasets package. In this example it is 'iris', which is the default value of this parameter. The node passes a DataFrame object of the DataContainer class to the next node, TABLE.

TABLE: This node creates a Plotly table visualization of the input DataFrame object of the DataContainer class.
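
To make the loading step concrete outside of Flojoy, here is a minimal sketch of how a dataset name could be mapped to its scikit-learn loader. It is not the block's implementation: `SUPPORTED`, `loader_name`, and `load_features` are hypothetical helper names introduced for illustration, and the sketch assumes scikit-learn and pandas are installed when `load_features` is called.

```python
from importlib import import_module

# Dataset names accepted by the block's dataset_name parameter.
SUPPORTED = ("iris", "diabetes", "digits", "linnerud", "wine", "breast_cancer")


def loader_name(dataset_name: str) -> str:
    """Return the name of the sklearn.datasets loader for a supported dataset."""
    if dataset_name not in SUPPORTED:
        raise ValueError(f"Unknown scikit-learn dataset: {dataset_name!r}")
    return f"load_{dataset_name}"


def load_features(dataset_name: str = "iris"):
    """Load the feature columns of a sample dataset as a pandas DataFrame.

    Hypothetical helper: scikit-learn is imported lazily, so it is only
    required when this function is actually called.
    """
    datasets = import_module("sklearn.datasets")
    loader = getattr(datasets, loader_name(dataset_name))
    # With return_X_y=True the loader returns a (data, target) tuple.
    # Only the data part is kept, mirroring the [0] index used in the block.
    features, _target = loader(as_frame=True, return_X_y=True)
    return features
```

Calling `load_features("wine")` would return the feature columns only. As in the block above, the target column is discarded, so the table shown by the TABLE node does not include the class labels or regression targets.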