Video Action Classification using 3D ResNet
Reference:
https://github.com/kenshohara/video-classification-3d-cnn-pytorch
(Linux)
1. Install Anaconda and set up PyTorch (GPU) (a quick GPU check is sketched after this list)
2. Download the code from GitHub (https://github.com/kenshohara/video-classification-3d-cnn-pytorch)
3. Download the pre-trained model (resnext-101-kinetics.pth) and copy it to the source code folder
https://drive.google.com/drive/folders/1zvl89AgFAApbH0At-gMuZSeQB_LpNP-M
4. Install FFmpeg: conda install -c anaconda ffmpeg
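Before moving on, it can help to confirm that the GPU build of PyTorch is actually being picked up. A minimal check, run inside the Anaconda environment:

    # Sanity check: PyTorch should report that CUDA is available.
    import torch
    print(torch.__version__)                  # installed PyTorch version
    print(torch.cuda.is_available())          # should print True on a working GPU setup
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))  # name of the first visible GPU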
(Windows) -- Not Recommended
1. Install Anaconda and set up PyTorch (GPU)
2. Download the code from GitHub (https://github.com/kenshohara/video-classification-3d-cnn-pytorch)
3. Download the pre-trained model (resnext-101-kinetics.pth) and copy it to the source code folder
https://drive.google.com/drive/folders/1zvl89AgFAApbH0At-gMuZSeQB_LpNP-M
4. Download FFmpeg and copy it to the source code folder
5. Change the code: subprocess.call() is used to remove and create folders on Linux; on Windows, use shutil and os instead (a sketch of the full change is shown at the end of this document)
   import shutil
   subprocess.call('rm -rf tmp', shell=True)  -->  if os.path.exists('tmp'): shutil.rmtree('tmp')
   subprocess.call('mkdir tmp', shell=True)   -->  os.makedirs('tmp')
6. Run the code:
python main.py --input ./input --video_root ./videos --output ./output.json --model ./resnext-101-kinetics.pth --resnet_shortcut B --mode score --model_name resnext --model_depth 101
7. The results are written to output.json; each entry contains "segment": [start frame number, end frame number] and a "label" with the predicted action class
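The output.json from step 7 can be inspected with a few lines of Python. This is a sketch: only the "segment" and "label" fields are described above; the surrounding layout (a list of videos, each with a "clips" list) is an assumption and may differ.

    # Sketch: print the predicted label for each segment in output.json.
    # The top-level layout (list of videos, each with a "clips" list) is an assumption;
    # only "segment" and "label" are documented in step 7.
    import json

    with open('output.json') as f:
        results = json.load(f)

    for video in results:
        for clip in video.get('clips', []):
            start, end = clip['segment']      # [start frame number, end frame number]
            print('frames {}-{}: {}'.format(start, end, clip['label']))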
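For the Windows change in step 5, the temporary-folder handling can be written with the standard library only, so the same code works on both Linux and Windows. A sketch of just that part of main.py (the rest of the script is unchanged):

    import os
    import shutil

    # Replaces subprocess.call('rm -rf tmp', shell=True):
    # remove the temporary frame folder if it already exists.
    if os.path.exists('tmp'):
        shutil.rmtree('tmp')

    # Replaces subprocess.call('mkdir tmp', shell=True):
    # create a fresh temporary folder.
    os.makedirs('tmp')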