doubt.datasets.gas_turbine

Gas turbine data set.

This data set is from the UCI data set archive, with the description taken from the original UCI description. Some feature names may have been altered, based on the description.

"""Gas turbine data set.

This data set is from the UCI data set archive, with the description taken from the
original UCI description. Some feature names may have been altered, based on the
description.
"""

import io
import zipfile

import pandas as pd

from .dataset import BASE_DATASET_DESCRIPTION, BaseDataset


class GasTurbine(BaseDataset):
    __doc__ = f"""
    Data have been generated from a sophisticated simulator of a Gas Turbine (GT),
    mounted on a Frigate characterized by a COmbined Diesel eLectric And Gas (CODLAG)
    propulsion plant type.

    The experiments have been carried out by means of a numerical simulator of a naval
    vessel (Frigate) characterized by a Gas Turbine (GT) propulsion plant. The
    different blocks forming the complete simulator (Propeller, Hull, GT, Gear Box and
    Controller) have been developed and fine tuned over the years on several similar
    real propulsion plants. In view of these observations the available data are in
    agreement with a possible real vessel.

    In this release of the simulator it is also possible to take into account the
    performance decay over time of the GT components such as GT compressor and
    turbines.

    The propulsion system behaviour has been described with these parameters:

        - Ship speed (linear function of the lever position lp).
        - Compressor degradation coefficient kMc.
        - Turbine degradation coefficient kMt.

    so that each possible degradation state can be described by a combination of this
    triple (lp,kMt,kMc).

    The range of decay of compressor and turbine has been sampled with a uniform grid
    of precision 0.001 so as to have a good granularity of representation.

    In particular for the compressor decay state discretization the kMc coefficient has
    been investigated in the domain [1; 0.95], and the turbine coefficient in the
    domain [1; 0.975].

    Ship speed has been investigated sampling the range of feasible speed from 3 knots
    to 27 knots with a granularity of representation equal to three knots.

    A series of measures (16 features) which indirectly represent the state of the
    system subject to performance decay has been acquired and stored in the dataset
    over the parameter space.

    {BASE_DATASET_DESCRIPTION}

    Features:
        lever_position (float):
            The position of the lever
        ship_speed (float):
            The ship speed, in knots
        shaft_torque (float):
            The shaft torque of the gas turbine, in kN m
        turbine_revolution_rate (float):
            The gas turbine rate of revolutions, in rpm
        generator_revolution_rate (float):
            The gas generator rate of revolutions, in rpm
        starboard_propeller_torque (float):
            The torque of the starboard propeller, in kN
        port_propeller_torque (float):
            The torque of the port propeller, in kN
        turbine_exit_temp (float):
            High pressure turbine exit temperature, in Celsius
        inlet_temp (float):
            Gas turbine compressor inlet air temperature, in Celsius
        outlet_temp (float):
            Gas turbine compressor outlet air temperature, in Celsius
        turbine_exit_pres (float):
            High pressure turbine exit pressure, in bar
        inlet_pres (float):
            Gas turbine compressor inlet air pressure, in bar
        outlet_pres (float):
            Gas turbine compressor outlet air pressure, in bar
        exhaust_pres (float):
            Gas turbine exhaust gas pressure, in bar
        turbine_injection_control (float):
            Turbine injection control, in percent
        fuel_flow (float):
            Fuel flow, in kg/s

    Targets:
        compressor_decay (float):
            Gas turbine compressor decay state coefficient
        turbine_decay (float):
            Gas turbine decay state coefficient

    Source:
        https://archive.ics.uci.edu/ml/datasets/Condition+Based+Maintenance+of+Naval+Propulsion+Plants

    Examples:
        Load in the data set::

            >>> dataset = GasTurbine()
            >>> dataset.shape
            (11934, 18)

        Split the data set into features and targets, as NumPy arrays::

            >>> X, y = dataset.split()
            >>> X.shape, y.shape
            ((11934, 16), (11934, 2))

        Perform a train/test split, also outputting NumPy arrays::

            >>> train_test_split = dataset.split(test_size=0.2, random_seed=42)
            >>> X_train, X_test, y_train, y_test = train_test_split
            >>> X_train.shape, y_train.shape, X_test.shape, y_test.shape
            ((9516, 16), (9516, 2), (2418, 16), (2418, 2))

        Output the underlying Pandas DataFrame::

            >>> df = dataset.to_pandas()
            >>> type(df)
            <class 'pandas.core.frame.DataFrame'>
    """

    _url = (
        "https://archive.ics.uci.edu/ml/machine-learning-databases/"
        "00316/UCI%20CBM%20Dataset.zip"
    )

    _features = range(16)
    _targets = [16, 17]

    def _prep_data(self, data: bytes) -> pd.DataFrame:
        """Prepare the data set.

        Args:
            data (bytes): The raw data

        Returns:
            Pandas dataframe: The prepared data
        """
        # Convert the bytes into a file-like object
        buffer = io.BytesIO(data)

        # Unzip the file and pull out the txt file
        with zipfile.ZipFile(buffer, "r") as zip_file:
            txt_bytes = zip_file.read("UCI CBM Dataset/data.txt")

        # Decode the text and strip the three leading spaces from each line
        txt = txt_bytes[3:].decode("utf-8").replace("\n   ", "\n")

        # Convert the remaining triple spaces into commas, to make loading it as a csv
        # file easier
        txt = txt.replace("   ", ",")

        # Convert the string into a file-like object
        csv_file = io.StringIO(txt)

        # Read the file-like object into a dataframe
        cols = [
            "lever_position",
            "ship_speed",
            "shaft_torque",
            "turbine_revolution_rate",
            "generator_revolution_rate",
            "starboard_propeller_torque",
            "port_propeller_torque",
            "turbine_exit_temp",
            "inlet_temp",
            "outlet_temp",
            "turbine_exit_pres",
            "inlet_pres",
            "outlet_pres",
            "exhaust_pres",
            "turbine_injection_control",
            "fuel_flow",
            "compressor_decay",
            "turbine_decay",
        ]
        df = pd.read_csv(csv_file, header=None, names=cols)

        return df
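
The text clean-up in _prep_data can be tried in isolation. The following is a minimal sketch that uses only the standard library instead of pandas; the sample bytes are made up for illustration and merely imitate the layout the code expects (three leading spaces on every line, three spaces between values).

```python
import csv
import io

# Made-up stand-in for data.txt: three leading spaces, three-space separators.
raw = b"   1.138   3.0   289.964\n   2.088   6.0   6960.18\n"

# Drop the first line's leading spaces, then the ones that follow each newline.
txt = raw[3:].decode("utf-8").replace("\n   ", "\n")

# The remaining triple spaces are field separators, so turn them into commas.
txt = txt.replace("   ", ",")

# Parse the comma-separated text and convert every field to a float.
rows = [[float(value) for value in row] for row in csv.reader(io.StringIO(txt))]
print(rows)
```

The order of the two replacements matters: stripping the leading spaces first keeps each row from starting with an empty field.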
class GasTurbine(doubt.datasets.dataset.BaseDataset):

Data have been generated from a sophisticated simulator of a Gas Turbine (GT), mounted on a Frigate characterized by a COmbined Diesel eLectric And Gas (CODLAG) propulsion plant type.

The experiments have been carried out by means of a numerical simulator of a naval vessel (Frigate) characterized by a Gas Turbine (GT) propulsion plant. The different blocks forming the complete simulator (Propeller, Hull, GT, Gear Box and Controller) have been developed and fine tuned over the years on several similar real propulsion plants. In view of these observations the available data are in agreement with a possible real vessel.

In this release of the simulator it is also possible to take into account the performance decay over time of the GT components such as GT compressor and turbines.

The propulsion system behaviour has been described with these parameters:
  • Ship speed (linear function of the lever position lp).
  • Compressor degradation coefficient kMc.
  • Turbine degradation coefficient kMt.

so that each possible degradation state can be described by a combination of this triple (lp,kMt,kMc).

The range of decay of compressor and turbine has been sampled with a uniform grid of precision 0.001 so as to have a good granularity of representation.

In particular for the compressor decay state discretization the kMc coefficient has been investigated in the domain [1; 0.95], and the turbine coefficient in the domain [1; 0.975].

Ship speed has been investigated sampling the range of feasible speed from 3 knots to 27 knots with a granularity of representation equal to three knots.
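
The sampling grid described above can be enumerated directly. The sketch below assumes that both endpoints of every range are included; under that assumption the number of grid points equals the 11934 rows reported in the Examples section. The variable names are illustrative only.

```python
from itertools import product

# Decay coefficients on a 0.001 grid. Counting in thousandths with integers
# avoids floating point drift when stepping through the ranges.
compressor_decay = [k / 1000 for k in range(950, 1001)]  # kMc in [0.95, 1]
turbine_decay = [k / 1000 for k in range(975, 1001)]  # kMt in [0.975, 1]

# Ship speed from 3 to 27 knots in steps of three knots.
ship_speed = list(range(3, 28, 3))

# Every combination of speed and the two decay coefficients.
grid = list(product(ship_speed, compressor_decay, turbine_decay))
print(len(compressor_decay), len(turbine_decay), len(ship_speed), len(grid))
```

This gives 51 compressor states, 26 turbine states and 9 speeds, so 51 × 26 × 9 = 11934 combinations.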

A series of measures (16 features) which indirectly represent the state of the system subject to performance decay has been acquired and stored in the dataset over the parameter space.

Arguments:
  • cache (str or None, optional): The name of the cache. It will be saved to cache in the current working directory. If None then no cache will be saved. Defaults to '.dataset_cache'.
Attributes:
  • cache (str or None): The name of the cache.
  • shape (tuple of integers): Dimensions of the data set
  • columns (list of strings): List of column names in the data set
Features:

  • lever_position (float): The position of the lever
  • ship_speed (float): The ship speed, in knots
  • shaft_torque (float): The shaft torque of the gas turbine, in kN m
  • turbine_revolution_rate (float): The gas turbine rate of revolutions, in rpm
  • generator_revolution_rate (float): The gas generator rate of revolutions, in rpm
  • starboard_propeller_torque (float): The torque of the starboard propeller, in kN
  • port_propeller_torque (float): The torque of the port propeller, in kN
  • turbine_exit_temp (float): High pressure turbine exit temperature, in Celsius
  • inlet_temp (float): Gas turbine compressor inlet air temperature, in Celsius
  • outlet_temp (float): Gas turbine compressor outlet air temperature, in Celsius
  • turbine_exit_pres (float): High pressure turbine exit pressure, in bar
  • inlet_pres (float): Gas turbine compressor inlet air pressure, in bar
  • outlet_pres (float): Gas turbine compressor outlet air pressure, in bar
  • exhaust_pres (float): Gas turbine exhaust gas pressure, in bar
  • turbine_injection_control (float): Turbine injection control, in percent
  • fuel_flow (float): Fuel flow, in kg/s

Targets:

  • compressor_decay (float): Gas turbine compressor decay state coefficient
  • turbine_decay (float): Gas turbine decay state coefficient
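
The class source sets `_features = range(16)` and `_targets = [16, 17]`, which are positions in the column list. As a rough illustration of how those positions map to names, the sketch below applies them to the column names from `_prep_data`; `feature_names` and `target_names` are hypothetical helpers, not attributes of the library.

```python
# Column names in file order, as listed in _prep_data.
cols = [
    "lever_position", "ship_speed", "shaft_torque",
    "turbine_revolution_rate", "generator_revolution_rate",
    "starboard_propeller_torque", "port_propeller_torque",
    "turbine_exit_temp", "inlet_temp", "outlet_temp",
    "turbine_exit_pres", "inlet_pres", "outlet_pres", "exhaust_pres",
    "turbine_injection_control", "fuel_flow",
    "compressor_decay", "turbine_decay",
]

feature_idx = range(16)  # same values as _features
target_idx = [16, 17]  # same values as _targets

# Hypothetical helpers: look up the names behind each index.
feature_names = [cols[i] for i in feature_idx]
target_names = [cols[i] for i in target_idx]
print(len(feature_names), target_names)
```

The first 16 columns are the measured features and the last two are the decay coefficients, which matches the (11934, 16) and (11934, 2) shapes in the Examples section.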

Source:

https://archive.ics.uci.edu/ml/datasets/Condition+Based+Maintenance+of+Naval+Propulsion+Plants

Examples:

Load in the data set:

>>> dataset = GasTurbine()
>>> dataset.shape
(11934, 18)

Split the data set into features and targets, as NumPy arrays:

>>> X, y = dataset.split()
>>> X.shape, y.shape
((11934, 16), (11934, 2))

Perform a train/test split, also outputting NumPy arrays:

>>> train_test_split = dataset.split(test_size=0.2, random_seed=42)
>>> X_train, X_test, y_train, y_test = train_test_split
>>> X_train.shape, y_train.shape, X_test.shape, y_test.shape
((9516, 16), (9516, 2), (2418, 16), (2418, 2))

Output the underlying Pandas DataFrame:

>>> df = dataset.to_pandas()
>>> type(df)
<class 'pandas.core.frame.DataFrame'>