doubt.datasets.gas_turbine
Gas turbine data set.
This data set is from the UCI data set archive, with the description being the original description verbatim. Some feature names may have been altered, based on the description.
"""Gas turbine data set.

This data set is from the UCI data set archive, with the description being the original
description verbatim. Some feature names may have been altered, based on the
description.
"""

import io
import zipfile

import pandas as pd

from .dataset import BASE_DATASET_DESCRIPTION, BaseDataset


class GasTurbine(BaseDataset):
    __doc__ = f"""
    Data have been generated from a sophisticated simulator of a Gas Turbine (GT),
    mounted on a Frigate characterized by a COmbined Diesel eLectric And Gas (CODLAG)
    propulsion plant type.

    The experiments have been carried out by means of a numerical simulator of a naval
    vessel (Frigate) characterized by a Gas Turbine (GT) propulsion plant. The
    different blocks forming the complete simulator (Propeller, Hull, GT, Gear Box and
    Controller) have been developed and fine tuned over the years on several similar
    real propulsion plants. In view of these observations the available data are in
    agreement with a possible real vessel.

    In this release of the simulator it is also possible to take into account the
    performance decay over time of the GT components such as GT compressor and
    turbines.

    The propulsion system behaviour has been described with these parameters:

    - Ship speed (linear function of the lever position lp).
    - Compressor degradation coefficient kMc.
    - Turbine degradation coefficient kMt.

    so that each possible degradation state can be described by a combination of this
    triple (lp,kMt,kMc).

    The range of decay of compressor and turbine has been sampled with a uniform grid
    of precision 0.001 so as to have a good granularity of representation.

    In particular for the compressor decay state discretization the kMc coefficient has
    been investigated in the domain [1; 0.95], and the turbine coefficient in the
    domain [1; 0.975].

    Ship speed has been investigated sampling the range of feasible speed from 3 knots
    to 27 knots with a granularity of representation equal to three knots.

    A series of measures (16 features) which indirectly represent the state of the
    system subject to performance decay has been acquired and stored in the dataset
    over the parameter space.

    {BASE_DATASET_DESCRIPTION}

    Features:
        lever_position (float):
            The position of the lever
        ship_speed (float):
            The ship speed, in knots
        shaft_torque (float):
            The shaft torque of the gas turbine, in kN m
        turbine_revolution_rate (float):
            The gas turbine rate of revolutions, in rpm
        generator_revolution_rate (float):
            The gas generator rate of revolutions, in rpm
        starboard_propeller_torque (float):
            The torque of the starboard propeller, in kN
        port_propeller_torque (float):
            The torque of the port propeller, in kN
        turbine_exit_temp (float):
            High pressure turbine exit temperature, in Celsius
        inlet_temp (float):
            Gas turbine compressor inlet air temperature, in Celsius
        outlet_temp (float):
            Gas turbine compressor outlet air temperature, in Celsius
        turbine_exit_pres (float):
            High pressure turbine exit pressure, in bar
        inlet_pres (float):
            Gas turbine compressor inlet air pressure, in bar
        outlet_pres (float):
            Gas turbine compressor outlet air pressure, in bar
        exhaust_pres (float):
            Gas turbine exhaust gas pressure, in bar
        turbine_injection_control (float):
            Turbine injection control, in percent
        fuel_flow (float):
            Fuel flow, in kg/s

    Targets:
        compressor_decay (float):
            Gas turbine compressor decay state coefficient
        turbine_decay (float):
            Gas turbine decay state coefficient

    Source:
        https://archive.ics.uci.edu/ml/datasets/Condition+Based+Maintenance+of+Naval+Propulsion+Plants

    Examples:
        Load in the data set::

            >>> dataset = GasTurbine()
            >>> dataset.shape
            (11934, 18)

        Split the data set into features and targets, as NumPy arrays::

            >>> X, y = dataset.split()
            >>> X.shape, y.shape
            ((11934, 16), (11934, 2))

        Perform a train/test split, also outputting NumPy arrays::

            >>> train_test_split = dataset.split(test_size=0.2, random_seed=42)
            >>> X_train, X_test, y_train, y_test = train_test_split
            >>> X_train.shape, y_train.shape, X_test.shape, y_test.shape
            ((9516, 16), (9516, 2), (2418, 16), (2418, 2))

        Output the underlying Pandas DataFrame::

            >>> df = dataset.to_pandas()
            >>> type(df)
            <class 'pandas.core.frame.DataFrame'>
    """

    _url = (
        "https://archive.ics.uci.edu/ml/machine-learning-databases/"
        "00316/UCI%20CBM%20Dataset.zip"
    )

    _features = range(16)
    _targets = [16, 17]

    def _prep_data(self, data: bytes) -> pd.DataFrame:
        """Prepare the data set.

        Args:
            data (bytes): The raw data

        Returns:
            Pandas dataframe: The prepared data
        """
        # Convert the bytes into a file-like object
        buffer = io.BytesIO(data)

        # Unzip the file and pull out the txt file
        with zipfile.ZipFile(buffer, "r") as zip_file:
            txt_bytes = zip_file.read("UCI CBM Dataset/data.txt")

        # Decode text and remove the initial triple space on each line
        txt = txt_bytes[3:].decode("utf-8").replace("\n   ", "\n")

        # Convert the remaining triple spaces into commas, to make loading it as a csv
        # file easier
        txt = txt.replace("   ", ",")

        # Convert the string into a file-like object
        csv_file = io.StringIO(txt)

        # Read the file-like object into a dataframe
        cols = [
            "lever_position",
            "ship_speed",
            "shaft_torque",
            "turbine_revolution_rate",
            "generator_revolution_rate",
            "starboard_propeller_torque",
            "port_propeller_torque",
            "turbine_exit_temp",
            "inlet_temp",
            "outlet_temp",
            "turbine_exit_pres",
            "inlet_pres",
            "outlet_pres",
            "exhaust_pres",
            "turbine_injection_control",
            "fuel_flow",
            "compressor_decay",
            "turbine_decay",
        ]
        df = pd.read_csv(csv_file, header=None, names=cols)

        return df
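The text-cleaning steps in `_prep_data` can be tried offline on a small synthetic archive. The sketch below is not the library's code: it assumes that each line of `data.txt` starts with three spaces and that values are separated by three spaces, it uses made-up values, and it swaps `pd.read_csv` for the standard-library `csv` module so that it runs without third-party packages.

```python
import csv
import io
import zipfile

# Hypothetical miniature of data.txt: each line starts with three spaces and
# values are separated by three spaces (an assumption about the raw layout).
rows = [["1.138", "3.000", "0.950"], ["2.088", "6.000", "0.975"]]
raw_txt = "".join("   " + "   ".join(row) + "\n" for row in rows)

# Pack the text into an in-memory zip archive under the path used in the source
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zip_file:
    zip_file.writestr("UCI CBM Dataset/data.txt", raw_txt)
data = archive.getvalue()

# Unzip, strip the leading spaces, and turn the separators into commas
with zipfile.ZipFile(io.BytesIO(data), "r") as zip_file:
    txt_bytes = zip_file.read("UCI CBM Dataset/data.txt")
txt = txt_bytes[3:].decode("utf-8").replace("\n   ", "\n")
txt = txt.replace("   ", ",")

# Parse the comma-separated text into rows of floats
parsed = [[float(value) for value in row] for row in csv.reader(io.StringIO(txt))]
print(parsed)
```

Slicing off the first three bytes handles the first line, which has no preceding newline for the `replace` call to match.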
Data have been generated from a sophisticated simulator of a Gas Turbine (GT), mounted on a Frigate characterized by a COmbined Diesel eLectric And Gas (CODLAG) propulsion plant type.
The experiments have been carried out by means of a numerical simulator of a naval vessel (Frigate) characterized by a Gas Turbine (GT) propulsion plant. The different blocks forming the complete simulator (Propeller, Hull, GT, Gear Box and Controller) have been developed and fine tuned over the years on several similar real propulsion plants. In view of these observations the available data are in agreement with a possible real vessel.
In this release of the simulator it is also possible to take into account the performance decay over time of the GT components such as GT compressor and turbines.
The propulsion system behaviour has been described with these parameters:
- Ship speed (linear function of the lever position lp).
- Compressor degradation coefficient kMc.
- Turbine degradation coefficient kMt.
so that each possible degradation state can be described by a combination of this triple (lp,kMt,kMc).
The range of decay of compressor and turbine has been sampled with a uniform grid of precision 0.001 so as to have a good granularity of representation.
In particular for the compressor decay state discretization the kMc coefficient has been investigated in the domain [1; 0.95], and the turbine coefficient in the domain [1; 0.975].
Ship speed has been investigated sampling the range of feasible speed from 3 knots to 27 knots with a granularity of representation equal to three knots.
A series of measures (16 features) which indirectly represent the state of the system subject to performance decay has been acquired and stored in the dataset over the parameter space.
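The sampling grid described above can be checked against the size of the data set. The sketch below assumes that both endpoints of each range are included, and it counts the decay coefficients in thousandths to avoid floating-point drift. The variable names are illustrative only.

```python
from itertools import product

# Decay coefficients in thousandths, so the 0.001 grid stays exact
kmc_values = range(950, 1001)  # compressor: 0.950 .. 1.000
kmt_values = range(975, 1001)  # turbine: 0.975 .. 1.000
speeds = range(3, 28, 3)  # ship speed in knots: 3, 6, ..., 27

# Every combination of speed and the two decay states
grid = list(product(speeds, kmc_values, kmt_values))
print(len(kmc_values), len(kmt_values), len(speeds), len(grid))
# prints: 51 26 9 11934
```

Under these assumptions the grid has 51 x 26 x 9 = 11934 points, which agrees with the 11934 rows reported in the examples.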
Arguments:
- cache (str or None, optional): The name of the cache. It will be saved to cache in the current working directory. If None then no cache will be saved. Defaults to '.dataset_cache'.
Attributes:
- cache (str or None): The name of the cache.
- shape (tuple of integers): Dimensions of the data set
- columns (list of strings): List of column names in the data set
Features:
- lever_position (float): The position of the lever
- ship_speed (float): The ship speed, in knots
- shaft_torque (float): The shaft torque of the gas turbine, in kN m
- turbine_revolution_rate (float): The gas turbine rate of revolutions, in rpm
- generator_revolution_rate (float): The gas generator rate of revolutions, in rpm
- starboard_propeller_torque (float): The torque of the starboard propeller, in kN
- port_propeller_torque (float): The torque of the port propeller, in kN
- turbine_exit_temp (float): High pressure turbine exit temperature, in Celsius
- inlet_temp (float): Gas turbine compressor inlet air temperature, in Celsius
- outlet_temp (float): Gas turbine compressor outlet air temperature, in Celsius
- turbine_exit_pres (float): High pressure turbine exit pressure, in bar
- inlet_pres (float): Gas turbine compressor inlet air pressure, in bar
- outlet_pres (float): Gas turbine compressor outlet air pressure, in bar
- exhaust_pres (float): Gas turbine exhaust gas pressure, in bar
- turbine_injection_control (float): Turbine injection control, in percent
- fuel_flow (float): Fuel flow, in kg/s
Targets:
- compressor_decay (float): Gas turbine compressor decay state coefficient
- turbine_decay (float): Gas turbine decay state coefficient
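The features and targets listed above appear to correspond to column positions, since the source sets `_features = range(16)` and `_targets = [16, 17]`. The sketch below assumes these indices refer to the column list in the order given in `_prep_data`. How `BaseDataset` uses them internally is not shown here, so treat the mapping as illustrative.

```python
# Column names in the order used by _prep_data
cols = [
    "lever_position", "ship_speed", "shaft_torque",
    "turbine_revolution_rate", "generator_revolution_rate",
    "starboard_propeller_torque", "port_propeller_torque",
    "turbine_exit_temp", "inlet_temp", "outlet_temp",
    "turbine_exit_pres", "inlet_pres", "outlet_pres", "exhaust_pres",
    "turbine_injection_control", "fuel_flow",
    "compressor_decay", "turbine_decay",
]

# Index sets copied from the class attributes
features = range(16)
targets = [16, 17]

# Resolve the indices to column names
feature_names = [cols[i] for i in features]
target_names = [cols[i] for i in targets]
print(len(feature_names), target_names)
```

This matches the shapes in the examples: 16 feature columns and 2 target columns out of 18 in total.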
Source:
https://archive.ics.uci.edu/ml/datasets/Condition+Based+Maintenance+of+Naval+Propulsion+Plants
Examples:
Load in the data set::
    >>> dataset = GasTurbine()
    >>> dataset.shape
    (11934, 18)
Split the data set into features and targets, as NumPy arrays::
    >>> X, y = dataset.split()
    >>> X.shape, y.shape
    ((11934, 16), (11934, 2))
Perform a train/test split, also outputting NumPy arrays::
    >>> train_test_split = dataset.split(test_size=0.2, random_seed=42)
    >>> X_train, X_test, y_train, y_test = train_test_split
    >>> X_train.shape, y_train.shape, X_test.shape, y_test.shape
    ((9516, 16), (9516, 2), (2418, 16), (2418, 2))
Output the underlying Pandas DataFrame::
    >>> df = dataset.to_pandas()
    >>> type(df)
    <class 'pandas.core.frame.DataFrame'>