1.set the configuration path in PyCharm
control + command + C
2.named tuple:
A named tuple groups related constant values under readable field names, which makes it a cleaner way to pass a set of parameters into a function.
example:
from collections import namedtuple
RangeParam = namedtuple('RangeParam', ['min_latitude', 'max_latitude', 'min_longitude', 'max_longitude', 'grid_size'])
PARAM = RangeParam(1.22, 1.47, 103.6, 104.05, 0.01)
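A short sketch of how such a named tuple might be used once it is defined. The helper in_range is hypothetical and only illustrates passing the whole group as one argument.

```python
from collections import namedtuple

RangeParam = namedtuple(
    'RangeParam',
    ['min_latitude', 'max_latitude', 'min_longitude', 'max_longitude', 'grid_size'])
PARAM = RangeParam(1.22, 1.47, 103.6, 104.05, 0.01)


def in_range(lat, lon, param):
    """Hypothetical helper: True if (lat, lon) lies inside the bounding box."""
    return (param.min_latitude <= lat <= param.max_latitude
            and param.min_longitude <= lon <= param.max_longitude)


print(PARAM.grid_size)               # fields are read by name
print(in_range(1.30, 103.8, PARAM))  # one argument instead of five

# Named tuples are immutable, so the values behave like constants.
try:
    PARAM.grid_size = 0.02
except AttributeError:
    print('cannot modify a namedtuple field')

# _replace returns a modified copy and leaves the original untouched.
FINE_PARAM = PARAM._replace(grid_size=0.005)
print(FINE_PARAM.grid_size)
```

Because the fields cannot be reassigned, a function that receives PARAM cannot change the shared settings by accident.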
3.use the following to test whether a file works when it is run as a script
import argparse
import os

import pandas as pd

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='Add GRID_ID to cell_df')
    parser.add_argument(
        '-i', '--input_dir',
        help='Directory containing cell_df.csv',
        required=True)
    parser.add_argument(
        '-o', '--out_dir',
        help='The output directory for the generated adding GRID_ID cell_df',
        required=True)
    args = parser.parse_args()
    # this should contain date folders, within each should contain groundtruth both "data" and "label"
    data_dir = args.input_dir
    cell_path = os.path.join(data_dir, 'CELLULAR_LOCATION.csv')
    output_dir = args.out_dir
    cell_df = pd.read_csv(cell_path)
    # add_grid_id is the function being tested, defined earlier in the same file
    cell_out_df = add_grid_id(cell_df)
    cell_out_df.to_csv(os.path.join(output_dir, 'CELLULAR_LOCATION_edited.csv'))
os.path.join(output_dir, cell_filename)
Different operating systems use different path separators (\ on Windows, / on Linux and macOS). os.path.join() builds the path with the correct separator for the current system, so the same code works on all of them.
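A minimal sketch of the separator behaviour. The directory name 'results' is made up for illustration. ntpath and posixpath are the standard-library modules behind os.path on Windows and on Linux/macOS, and calling them directly shows both results on any machine.

```python
import ntpath      # Windows flavour of os.path
import os
import posixpath   # Linux/macOS flavour of os.path

output_dir = 'results'                        # hypothetical directory name
cell_filename = 'CELLULAR_LOCATION_edited.csv'

# os.path.join uses the separator of the system the code is running on.
native_path = os.path.join(output_dir, cell_filename)

# The two flavours can be called directly to compare the results.
windows_path = ntpath.join(output_dir, cell_filename)
posix_path = posixpath.join(output_dir, cell_filename)

print(native_path)
print(windows_path)
print(posix_path)
```

In normal code only os.path.join is needed. The other two calls are there just to make the difference visible.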
4.Python functions
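(1).def foo(*a):
def foo(a, b):
    print(a)
def foo(*a):
    print(a)
a = [1, 2, 3, 4]
With a *, the function accepts any number of positional arguments; they are collected into the tuple a. An existing list can be unpacked into separate arguments with foo(*a).
def foo(*a, **b):
    print(a)
    print(b)
Here a collects the positional arguments (flexible length) and b collects the keyword arguments into a dict; both are optional.
foo([1, 2, 3, 4], x=12, y=14)
result:
([1, 2, 3, 4],)
{'x': 12, 'y': 14}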
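A small sketch comparing a list passed as-is with a list unpacked by *. To make the values easy to inspect, this version of foo returns the collected arguments instead of printing them; that change is an assumption made for the example.

```python
def foo(*a, **b):
    """Return the collected arguments so they can be inspected."""
    return a, b


values = [1, 2, 3, 4]

# Passing the list as-is: it arrives as ONE positional argument.
args, kwargs = foo(values, x=12, y=14)
print(args)    # ([1, 2, 3, 4],)
print(kwargs)  # {'x': 12, 'y': 14}

# Unpacking with *: each element becomes its own positional argument.
args, kwargs = foo(*values)
print(args)    # (1, 2, 3, 4)
print(kwargs)  # {}

# Unpacking a dict with ** supplies the keyword arguments.
options = {'x': 12, 'y': 14}
args, kwargs = foo(**options)
print(args)    # ()
print(kwargs)  # {'x': 12, 'y': 14}
```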
(2).more flexibility in input parameters:
def add_grid_id(cell_df, param, lat_col='LATITUDE', lon_col='LONGITUDE', grid_col='GRID_ID'):
    """
    Add a grid id for each cell based on its latitude and longitude.
    :param cell_df: cellular location data frame, which contains each cell's latitude and longitude.
    :param param: parameters of the bounding box that contains Singapore on the map.
    :param lat_col: name of the latitude column.
    :param lon_col: name of the longitude column.
    :param grid_col: name of the column in which to store the grid id.
    :return: cell_df with grid_col added, holding the grid id for the corresponding lat and lon.
    """
    cell_df[grid_col] = cell_df.apply(lambda r: to_grid(r[lon_col], r[lat_col], param), axis=1)
    return cell_df
This function only needs two columns of cell_df, the latitude column and the longitude column. Passing the column names as parameters with default values makes it more flexible: it also works on data frames that use different column names.
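A sketch of the same idea without pandas, so it runs on its own. Rows are plain dicts instead of a data frame. The notes do not show to_grid, so the version here is a hypothetical stand-in that returns a 'row_col' string.

```python
import math
from collections import namedtuple

RangeParam = namedtuple(
    'RangeParam',
    ['min_latitude', 'max_latitude', 'min_longitude', 'max_longitude', 'grid_size'])
PARAM = RangeParam(1.22, 1.47, 103.6, 104.05, 0.01)


def to_grid(lon, lat, param):
    """Hypothetical stand-in for to_grid: 'row_col' index inside the box."""
    row = math.floor((lat - param.min_latitude) / param.grid_size)
    col = math.floor((lon - param.min_longitude) / param.grid_size)
    return '{}_{}'.format(row, col)


def add_grid_id(rows, param, lat_col='LATITUDE', lon_col='LONGITUDE', grid_col='GRID_ID'):
    """Same parameter layout as above, but rows is a list of dicts."""
    for r in rows:
        r[grid_col] = to_grid(r[lon_col], r[lat_col], param)
    return rows


# Default column names.
default_rows = add_grid_id([{'LATITUDE': 1.305, 'LONGITUDE': 103.825}], PARAM)

# Different column names: only the keyword arguments change.
custom_rows = add_grid_id([{'lat': 1.235, 'lon': 103.615}], PARAM,
                          lat_col='lat', lon_col='lon', grid_col='grid')

print(default_rows[0]['GRID_ID'])
print(custom_rows[0]['grid'])
```

Callers whose data already uses the default names can ignore the extra parameters entirely.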
(3).function with exception
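def divide(a, b):
    try:
        c = a / b
    except Exception:
        c = 0
    return c
In this case we don't need to handle every invalid scenario separately: if the code in the try block fails, execution moves to the except block. Catching a specific exception such as ZeroDivisionError is safer, because a broad except also hides unrelated bugs.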
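A minimal sketch of a narrower variant that only catches division by zero. The name safe_divide and the default parameter are hypothetical, chosen for illustration.

```python
def safe_divide(a, b, default=0):
    """Hypothetical variant of the function above with a narrower except."""
    try:
        return a / b
    except ZeroDivisionError:
        # Only the expected failure is turned into the fallback value.
        return default


print(safe_divide(6, 3))   # 2.0
print(safe_divide(6, 0))   # 0

# A wrong argument type is a bug in the caller, so it still raises.
try:
    safe_divide('6', 3)
except TypeError as err:
    print('TypeError:', err)
```

The trade-off is that each expected failure has to be listed, but unexpected errors stay visible instead of silently becoming 0.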
5.How to check a function in PyCharm?
Select the function's name and press option + command + right arrow.
To go back, press option + command + left arrow.
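6.ssh and conda environments
ssh yguo@epsilon-compute1
conda env list
source activate go_jek_yw
source deactivate
(with conda 4.4 or later, conda activate go_jek_yw and conda deactivate do the same)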