Shakerato
YouTube-8M dataset file renaming for avoiding duplicate file names in Windows (Python code)
Research
Shakeratto 2018. 6. 6. 21:36
## Windows treats file names that differ only in letter case (e.g. Ab.txt, aB.txt) as duplicates.
## To avoid this problem, I first downloaded the dataset and renamed the files on Linux,
## then copied the renamed files to my Windows desktop to train a deep neural network.
## The renamed dataset also works well with the YouTube-8M training code.
import os
import shutil

oldpath = '/data/yt8m/v2/video/'
newpath = '/data/yt8m/v2/video2/'

# Create the destination directory if it does not exist yet
os.makedirs(newpath, exist_ok=True)

# Sort the listing so the numbering is reproducible (os.listdir order is arbitrary)
file_list = sorted(os.listdir(oldpath))

file_train_cnt = 0
file_test_cnt = 0
file_validate_cnt = 0
new_file_name = ''

for filename in file_list:
    old_file_path = os.path.join(oldpath, filename)
    # Original name without the extension, e.g. 'trainAb'
    base_name = filename.split('.')[0]

    # Prefix each file with its split name and a zero-padded counter,
    # so the new names stay unique even when letter case is ignored
    if filename.startswith('tra') and '.tfrecord' in filename:
        new_file_name = 'train' + str(file_train_cnt).zfill(5) + '_' + \
            base_name + '.tfrecord'
        file_train_cnt += 1
    elif filename.startswith('tes') and '.tfrecord' in filename:
        new_file_name = 'test' + str(file_test_cnt).zfill(5) + '_' + \
            base_name + '.tfrecord'
        file_test_cnt += 1
    elif filename.startswith('val') and '.tfrecord' in filename:
        new_file_name = 'validate' + str(file_validate_cnt).zfill(5) + \
            '_' + base_name + '.tfrecord'
        file_validate_cnt += 1
    else:
        # Skip anything that is not a train/test/validate .tfrecord file
        continue

    new_file_path = os.path.join(newpath, new_file_name)
    # Copy in chunks instead of reading the whole file into memory
    shutil.copyfile(old_file_path, new_file_path)
    print('Successfully copied! [', filename, ' to ', new_file_path, ']')

# Number of files copied per split
print(file_train_cnt)
print(file_test_cnt)
print(file_validate_cnt)
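
Before copying the renamed files to Windows, it may be worth checking that no two names differ only by letter case. The following is a minimal sketch of such a check. The helper name `find_case_collisions` and the sample file names are hypothetical and not part of the original script, and it assumes that comparing `str.casefold()` results is a good enough approximation of how a case-insensitive file system compares names.

```python
from collections import defaultdict


def find_case_collisions(names):
    """Return groups of names that become identical when letter case is ignored."""
    groups = defaultdict(list)
    for name in names:
        # Names that share a case-folded key would clash on a case-insensitive file system
        groups[name.casefold()].append(name)
    # Keep only the keys that more than one name maps to
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical sample names, used only for illustration
original = ['trainAb.tfrecord', 'trainaB.tfrecord', 'trainCd.tfrecord']
renamed = ['train00000_trainAb.tfrecord',
           'train00001_trainaB.tfrecord',
           'train00002_trainCd.tfrecord']

print(find_case_collisions(original))  # one colliding pair
print(find_case_collisions(renamed))   # no collisions
```

In practice the same function could be called on `os.listdir(newpath)` after the renaming script finishes; an empty result suggests the directory is safe to copy to Windows.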