smdc_perftests.performance_tests package
Submodules
smdc_perftests.performance_tests.analyze module
Module for analyzing the test results. Created on Thu Apr 2 14:30:51 2015
@author: christoph.paulik@geo.tuwien.ac.at
-
smdc_perftests.performance_tests.analyze.bar_plot(df, show=True)
Make a bar plot from the gathered results.
Parameters: df: pandas.DataFrame
Measured data
show: boolean
if set then the plot is shown
Returns: ax: matplotlib.axes
axes of the plot
-
smdc_perftests.performance_tests.analyze.prep_results(results_files, name_fm=None, grouping_f=None)
Takes a list of results file names and bundles the results into a pandas DataFrame.
Parameters: results_files: list
list of filenames to load
name_fm: function, optional
if set, a function that takes the name of the results and returns a more meaningful name. This is useful if the names of the results are very long or verbose.
grouping_f: function, optional
can be used to assign groups according to the name of the results. Takes the name and returns a string.
Returns: df : pandas.DataFrame
Results named and possibly grouped
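Both optional callbacks can be plain functions that map a result name to a string. The following is a minimal sketch, assuming hypothetical result names of the form `<dataset>_<test>_<date>`; the naming scheme and both helper functions are illustrative assumptions and not part of smdc_perftests.

```python
def short_name(name):
    """Hypothetical name_fm: keep only the first two '_'-separated parts."""
    return "_".join(name.split("_")[:2])


def group_by_dataset(name):
    """Hypothetical grouping_f: use the text before the first '_' as the group."""
    return name.split("_")[0]


# Made-up result names, used only for illustration.
names = ["ascat_read-ts_2015-04-02", "cci_read-img_2015-04-02"]
short = [short_name(n) for n in names]
groups = [group_by_dataset(n) for n in names]
print(short, groups)
```

Functions like these could then be passed as `prep_results(results_files, name_fm=short_name, grouping_f=group_by_dataset)`.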
smdc_perftests.performance_tests.test_cases module
This module contains functions that run tests according to the specifications from the SMDC Performance comparison document.
Interfaces to data should be interchangeable as long as they adhere to the interface specifications from the rsdata module.
Created on Tue Oct 21 13:37:58 2014
@author: christoph.paulik@geo.tuwien.ac.at
-
class smdc_perftests.performance_tests.test_cases.SelfTimingDataset(ds, timefuncs=['get_timeseries', 'get_avg_image', 'get_data'])
Bases: object
Dataset class that times the functions of a dataset instance it gets in its constructor.
Stores the results as TestResults instances in a dictionary with the timed function names as keys.
Methods
gentimedfunc(funcname): generate a timed function that calls
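The idea of wrapping selected dataset methods with timers can be sketched as follows. This is an illustrative stand-in under stated assumptions, not the SelfTimingDataset implementation: the class `TimingWrapper`, the `measurements` dictionary and `DummyDataset` are hypothetical, and raw durations are stored in plain lists rather than in TestResults instances.

```python
import time


class TimingWrapper(object):
    """Illustrative stand-in that times selected methods of a dataset."""

    def __init__(self, ds, timefuncs=("get_timeseries",)):
        self.ds = ds
        self.measurements = {}  # function name -> list of durations in seconds
        for funcname in timefuncs:
            self.measurements[funcname] = []
            setattr(self, funcname, self._gentimedfunc(funcname))

    def _gentimedfunc(self, funcname):
        func = getattr(self.ds, funcname)

        def timed(*args, **kwargs):
            # Time one call of the wrapped method and record the duration.
            start = time.perf_counter()
            result = func(*args, **kwargs)
            self.measurements[funcname].append(time.perf_counter() - start)
            return result

        return timed


class DummyDataset(object):
    """Hypothetical dataset used only to exercise the wrapper."""

    def get_timeseries(self, gpi):
        return [gpi] * 3


wrapped = TimingWrapper(DummyDataset())
data = wrapped.get_timeseries(42)
```

Keying the measurements by function name mirrors the dictionary layout described above, so the timings of each method stay separate.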
-
class smdc_perftests.performance_tests.test_cases.TestResults(init_obj, name=None, ddof=1)
Bases: object
Simple object that contains the test results and can be used to compare the test results to other test results.
Objects of this type can also be plotted by the plotting routines.
Parameters: measured times or filename: list or string
list of measured times or netCDF4 file produced by to_nc of another TestResults object
ddof: int
difference degrees of freedom. This is used to calculate standard deviation and variance. It is the number that is subtracted from the sample number n when estimating the population standard deviation and variance. See Bessel's correction, e.g. on Wikipedia, for an explanation.
Attributes
median: float
median of the measurements
n: int
sample size
stdev: float
standard deviation
var: float
variance
total: float
total time expired
mean: float
mean time per test run
Methods
confidence_int([conf_level]): Calculate confidence interval of the mean
to_nc(filename): store results on disk as a netCDF4 file
-
confidence_int(conf_level=95)
Calculate confidence interval of the mean time measured.
Parameters: conf_level: float
confidence level desired for the confidence interval, in percent. This is transformed into the quantile needed to get the critical value of the t distribution. Default is a 95% confidence interval.
Returns: lower_mean : float
lower confidence interval boundary
mean : float
mean value
upper_mean : float
upper confidence interval boundary
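The role of ddof and the shape of the returned interval can be illustrated with a small standard-library sketch. It is not the TestResults code: the helper names are hypothetical, the measurements are made up, and the t critical value is hard-coded from a standard table instead of being derived from conf_level.

```python
import math


def sample_stats(times, ddof=1):
    """Return n, mean, variance and standard deviation with an n - ddof divisor."""
    n = len(times)
    mean = sum(times) / n
    var = sum((t - mean) ** 2 for t in times) / (n - ddof)
    return n, mean, var, math.sqrt(var)


def confidence_interval(times, t_crit, ddof=1):
    """Return (lower, mean, upper) for the mean, given a t critical value."""
    n, mean, _, stdev = sample_stats(times, ddof)
    half_width = t_crit * stdev / math.sqrt(n)
    return mean - half_width, mean, mean + half_width


times = [1.0, 2.0, 3.0, 4.0, 5.0]  # made-up measurements in seconds
# 2.776 is the two-sided 95% critical value of Student's t distribution
# for n - 1 = 4 degrees of freedom, taken from a standard t table.
lower, mean, upper = confidence_interval(times, t_crit=2.776)
```

With ddof=1 the variance of these five values is 2.5, with ddof=0 it is 2.0; the difference is Bessel's correction.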
-
smdc_perftests.performance_tests.test_cases.measure(exper_name, runs=5, ddof=1)
Decorator that measures the running time of a function and calculates statistics.
Parameters: exper_name: string
experiment name, used for plotting and saving
runs: int
number of test runs to perform
ddof: int
difference degrees of freedom. This is used to calculate standard deviation and variance. It is the number that is subtracted from the sample number n when estimating the population standard deviation and variance. See Bessel's correction, e.g. on Wikipedia, for an explanation.
Returns: results: dict
TestResults instance
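A timing decorator of this kind might look roughly like the sketch below. It is an assumption-laden stand-in, not the library's `measure`: the name `measure_sketch` is hypothetical, it returns a plain dict instead of a TestResults instance, and it always uses the n - 1 divisor.

```python
import functools
import statistics
import time


def measure_sketch(exper_name, runs=5):
    """Hypothetical timing decorator; the wrapped call returns a dict of stats."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            times = []
            for _ in range(runs):
                start = time.perf_counter()
                func(*args, **kwargs)
                times.append(time.perf_counter() - start)
            return {
                "name": exper_name,
                "n": len(times),
                "mean": statistics.mean(times),
                "stdev": statistics.stdev(times),  # n - 1 divisor, i.e. ddof=1
                "total": sum(times),
            }

        return wrapper

    return decorator


@measure_sketch("sum_of_squares", runs=5)
def sum_of_squares(limit):
    return sum(i * i for i in range(limit))


results = sum_of_squares(10000)
```

Note that calling the decorated function returns the timing statistics rather than the function's own return value, which matches the "Returns" entry above in spirit.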
-
smdc_perftests.performance_tests.test_cases.read_rand_cells_by_cell_list(dataset, cell_date_list, cell_id, read_perc=1.0, max_runtime=None)
Reads data from the dataset using the get_data method. In this method the start and end datetimes are fixed for all cell IDs that are read.
Parameters: dataset: instance
instance of a class that implements a get_data(date_start, date_end, cell_id) method
date_start: datetime
start dates which should be read.
date_end: datetime
end dates which should be read.
cell_date_list: list of tuples
time intervals to read for each cell
cell_id: int or iterable
cell ids which should be read. Can also be a list of integers.
read_perc: float
percentage of cell ids to read from the
max_runtime: int, optional
maximum runtime of the test in seconds.
-
smdc_perftests.performance_tests.test_cases.read_rand_img_by_date_list(dataset, date_list, read_perc=1.0, max_runtime=None, **kwargs)
Reads image data for random dates on a list. Additional kwargs are given to the read_img method of the dataset.
Parameters: dataset: instance
instance of a class that implements a read_img(datetime) method
date_list: iterable
list of datetime objects
read_perc: float
percentage of datetimes out of date_list to read
max_runtime: int, optional
maximum runtime of the test in seconds.
**kwargs:
other keywords are passed to the get_avg_image method of the dataset
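The combination of read_perc and max_runtime can be pictured with the following sketch. It rests on two assumptions that the text above does not confirm: that read_perc is a percentage on a 0 to 100 scale, and that the runtime limit is checked between reads. The helper `read_random_subset` and `DummyImageDataset` are hypothetical.

```python
import random
import time
from datetime import datetime, timedelta


def read_random_subset(read_func, items, read_perc=1.0, max_runtime=None):
    """Call read_func on a random read_perc percent of items (at least one).

    Stops early once max_runtime seconds have elapsed; the limit is only
    checked between reads. Returns the number of reads performed.
    """
    n_read = max(1, int(len(items) * read_perc / 100.0))
    selected = random.sample(list(items), n_read)
    start = time.perf_counter()
    n_done = 0
    for item in selected:
        if max_runtime is not None and time.perf_counter() - start > max_runtime:
            break
        read_func(item)
        n_done += 1
    return n_done


class DummyImageDataset(object):
    """Hypothetical dataset that only records which dates were requested."""

    def __init__(self):
        self.dates_read = []

    def read_img(self, date):
        self.dates_read.append(date)


dates = [datetime(2007, 1, 1) + timedelta(days=i) for i in range(200)]
ds = DummyImageDataset()
n_done = read_random_subset(ds.read_img, dates, read_perc=10.0)
```

Sampling without replacement means no date is read twice within one run, so repeated reads of the same item do not distort the timing.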
-
smdc_perftests.performance_tests.test_cases.read_rand_img_by_date_range(dataset, date_list, read_perc=1.0, max_runtime=None, **kwargs)
Reads image data between random dates on a list. Additional kwargs are given to the read_img method of the dataset.
Parameters: dataset: instance
instance of a class that implements a read_img(datetime) method
date_list: iterable
list of datetime objects. The format is a list of lists, e.g.
[[datetime(2007,1,1), datetime(2007,1,1)],  # reads one day
[datetime(2007,1,1), datetime(2007,12,31)]]  # reads one year
read_perc: float
percentage of datetimes out of date_list to read
max_runtime: int, optional
maximum runtime of the test in seconds.
**kwargs:
other keywords are passed to the get_avg_image method of the dataset
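A list of date ranges in the format described above could be generated along these lines. The helper `random_date_ranges`, its parameters and the chosen period are hypothetical and only illustrate the list-of-lists layout.

```python
import random
from datetime import datetime, timedelta


def random_date_ranges(first, last, n, max_days=365, seed=None):
    """Return n random [begin, end] pairs with first <= begin <= end <= last."""
    rng = random.Random(seed)
    total_days = (last - first).days
    ranges = []
    for _ in range(n):
        begin = first + timedelta(days=rng.randint(0, total_days))
        # Clip the end date so the range never leaves the overall period.
        end = min(last, begin + timedelta(days=rng.randint(0, max_days)))
        ranges.append([begin, end])
    return ranges


first, last = datetime(2007, 1, 1), datetime(2013, 12, 31)
date_list = random_date_ranges(first, last, n=5, seed=0)
```

Passing a seed makes the generated ranges reproducible, which helps when the same test is repeated against different datasets.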
-
smdc_perftests.performance_tests.test_cases.read_rand_ts_by_gpi_list(dataset, gpi_list, read_perc=1.0, max_runtime=None, **kwargs)
Reads time series data for random grid point indices in a list. Additional kwargs are given to the read_ts method of the dataset.
Parameters: dataset: instance
instance of a class that implements a read_ts(gpi) method
gpi_list: iterable
list or numpy array of grid point indices
read_perc: float
percentage of points from gpi_list to read
max_runtime: int, optional
maximum runtime of the test in seconds.
**kwargs:
other keywords are passed to the get_timeseries method of the dataset
smdc_perftests.performance_tests.test_scripts module
Module implements the test cases specified in the performance test protocol. Created on Wed Apr 1 10:59:05 2015
@author: christoph.paulik@geo.tuwien.ac.at
-
smdc_perftests.performance_tests.test_scripts.run_ascat_tests(dataset, testname, results_dir, n_dates=10000, date_read_perc=0.1, gpi_read_perc=0.1, repeats=3, cell_read_perc=10.0, max_runtime_per_test=None)
Runs the ASCAT tests given a dataset instance.
Parameters: dataset: Dataset instance
Instance of a Dataset class
testname: string
Name of the test, used for storing the results
results_dir: string
path where to store the test results
n_dates: int, optional
number of dates to generate
date_read_perc: float, optional
percentage of random selection from date_range_list read for each try
gpi_read_perc: float, optional
percentage of random selection from gpi_list read for each try
repeats: int, optional
number of repeats of the tests
cell_list: list, optional
list of possible cells to read from. If given, the read_data test will be run.
max_runtime_per_test: float, optional
maximum runtime per test in seconds. If given, the tests will be aborted after taking more than this time.
-
smdc_perftests.performance_tests.test_scripts.run_equi7_tests(dataset, testname, results_dir, n_dates=10000, date_read_perc=0.1, gpi_read_perc=0.1, repeats=3, cell_read_perc=100.0, max_runtime_per_test=None)
Runs the ASAR/Sentinel 1 Equi7 tests given a dataset instance.
Parameters: dataset: Dataset instance
Instance of a Dataset class
testname: string
Name of the test, used for storing the results
results_dir: string
path where to store the test results
n_dates: int, optional
number of dates to generate
date_read_perc: float, optional
percentage of random selection from date_range_list read for each try
gpi_read_perc: float, optional
percentage of random selection from gpi_list read for each try
repeats: int, optional
number of repeats of the tests
cell_list: list, optional
list of possible cells to read from. If given, the read_data test will be run.
max_runtime_per_test: float, optional
maximum runtime per test in seconds. If given, the tests will be aborted after taking more than this time.
-
smdc_perftests.performance_tests.test_scripts.run_esa_cci_netcdf_tests(test_dir, results_dir, variables=['sm'])
Function for running the ESA CCI netCDF performance tests. The tests will be run for all .nc files in the test_dir.
Parameters: test_dir: string
path to the test files
results_dir: string
path in which the results should be stored
variables: list
list of variables to read for the tests
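Collecting all .nc files of a directory, as the description above implies, can be done with the standard library. This is only a sketch of the file discovery step; the helper `find_netcdf_files` is hypothetical, and the temporary directory with empty files stands in for real test data.

```python
import glob
import os
import tempfile


def find_netcdf_files(test_dir):
    """Return the sorted paths of all files ending in .nc directly in test_dir."""
    return sorted(glob.glob(os.path.join(test_dir, "*.nc")))


# Build a throwaway directory with two empty .nc files and one other file.
test_dir = tempfile.mkdtemp()
for fname in ("a.nc", "b.nc", "notes.txt"):
    open(os.path.join(test_dir, fname), "w").close()

found = [os.path.basename(f) for f in find_netcdf_files(test_dir)]
print(found)
```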
-
smdc_perftests.performance_tests.test_scripts.run_esa_cci_tests(dataset, testname, results_dir, n_dates=10000, date_read_perc=0.1, gpi_read_perc=0.1, repeats=3, cell_read_perc=10.0, max_runtime_per_test=None)
Runs the ESA CCI tests given a dataset instance.
Parameters: dataset: Dataset instance
Instance of a Dataset class
testname: string
Name of the test, used for storing the results
results_dir: string
path where to store the test results
n_dates: int, optional
number of dates to generate
date_read_perc: float, optional
percentage of random selection from date_range_list read for each try
gpi_read_perc: float, optional
percentage of random selection from gpi_list read for each try
repeats: int, optional
number of repeats of the tests
cell_list: list, optional
list of possible cells to read from. If given, the read_data test will be run.
max_runtime_per_test: float, optional
maximum runtime per test in seconds. If given, the tests will be aborted after taking more than this time.
-
smdc_perftests.performance_tests.test_scripts.run_performance_tests(name, dataset, save_dir, gpi_list=None, date_range_list=None, cell_list=None, cell_date_list=None, gpi_read_perc=1.0, date_read_perc=1.0, cell_read_perc=1.0, max_runtime_per_test=None, repeats=1)
Run a complete test suite on a dataset and store the results in the specified directory.
Parameters: name: string
name of the test run, used for naming the result files
dataset: dataset instance
instance implementing the get_timeseries, get_avg_image and get_data methods.
save_dir: string
directory to store the test results in
gpi_list: list, optional
list of possible grid point indices. If given, the timeseries reading tests will be run.
date_range_list: list, optional
list of possible dates. If given, the read_avg_image and read_data tests will be run. The format is a list of lists, e.g.
[[datetime(2007,1,1), datetime(2007,1,1)],  # reads one day
[datetime(2007,1,1), datetime(2007,12,31)]]  # reads one year
cell_list: list, optional
list of possible cells to read from. If given, the read_data test will be run.
cell_date_list: list, optional
list of time intervals to read per cell. Should be as long as the cell list or longer.
gpi_read_perc: float, optional
percentage of random selection from gpi_list read for each try
date_read_perc: float, optional
percentage of random selection from date_range_list read for each try
cell_read_perc: float, optional
percentage of random selection from cell_range_list read for each try
max_runtime_per_test: float, optional
maximum runtime per test in seconds. If given, the tests will be aborted after taking more than this time.
repeats: int, optional
number of repeats for each measurement
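The arguments described above might be prepared as in the following sketch. Everything in it is illustrative: `DummyDataset` is hypothetical, only the get_data(date_start, date_end, cell_id) signature is taken from this page, the signatures of get_timeseries and get_avg_image are assumptions, and the grid point indices, cell numbers, percentages and results path are made up.

```python
from datetime import datetime


class DummyDataset(object):
    """Hypothetical dataset exposing the three methods named above."""

    def get_timeseries(self, gpi, **kwargs):  # assumed signature
        return [0.0]

    def get_avg_image(self, date_start, date_end=None, **kwargs):  # assumed
        return [[0.0]]

    def get_data(self, date_start, date_end, cell_id):
        return {"cell": cell_id, "start": date_start, "end": date_end}


gpi_list = list(range(1000))  # made-up grid point indices
date_range_list = [
    [datetime(2007, 1, 1), datetime(2007, 1, 1)],    # one day
    [datetime(2007, 1, 1), datetime(2007, 12, 31)],  # one year
]
cell_list = [10, 11, 12]  # made-up cell numbers
# One time interval per cell, so the list is as long as cell_list.
cell_date_list = [(datetime(2007, 1, 1), datetime(2007, 12, 31))] * len(cell_list)

test_kwargs = dict(
    gpi_list=gpi_list,
    date_range_list=date_range_list,
    cell_list=cell_list,
    cell_date_list=cell_date_list,
    gpi_read_perc=1.0,
    date_read_perc=50.0,
    max_runtime_per_test=60.0,
    repeats=3,
)
# With smdc_perftests installed, the call would presumably look like:
# run_performance_tests("dummy_run", DummyDataset(), "/tmp/results", **test_kwargs)
```

Leaving out gpi_list, date_range_list or cell_list would, according to the parameter descriptions, skip the corresponding group of tests.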