Mask R-CNN installation notes

apt-get install docker.io

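nvidia-docker lives at: https://github.com/NVIDIA/nvidia-docker

Updating the package sources is extremely slow...

So download the package directly and install it; on Ubuntu the RPM has to be installed with alien.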

wget  https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm 

apt-get install alien
alien -i nvidia-docker-1.0.1-1.x86_64.rpm
nvidia-docker run caffe2ai/caffe2:c2v0.8.1.cuda8.cudnn7.ubuntu16.04

Error:

docker: Error response from daemon: create nvidia_driver_387.26: create nvidia_driver_387.26: Error looking up volume plugin nvidia-docker: legacy plugin: plugin not found.

sudo nvidia-docker-plugin

nvidia-docker-plugin | 2018/01/25 18:30:47 Loading NVIDIA unified memory
nvidia-docker-plugin | 2018/01/25 18:30:47 Loading NVIDIA management library
nvidia-docker-plugin | 2018/01/25 18:30:47 Discovering GPU devices
nvidia-docker-plugin | 2018/01/25 18:30:48 Error: cuda: out of memory

sudo systemctl start nvidia-docker fails:

Jan 26 11:51:34 bj-s-19 systemd[1]: nvidia-docker.service: Main process exited, code=exited, status=217/USER
Jan 26 11:51:34 bj-s-19 systemd[1]: nvidia-docker.service: Control process exited, code=exited status=217
Jan 26 11:51:34 bj-s-19 systemd[1]: nvidia-docker.service: Control process exited, code=exited status=217

vim /usr/lib/systemd/system/nvidia-docker.service

Change the USER setting (nvidia-docker) to root.
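Exit status 217/USER is the code systemd reports when it cannot set up the user a unit is configured to run as. The edit above can be sketched as follows, assuming the unit file declares the account with a `User=nvidia-docker` line (the exact contents of the packaged unit file are an assumption, and `set_unit_user` is a hypothetical helper, not part of nvidia-docker):

```python
import re

def set_unit_user(unit_text, user="root"):
    """Return unit_text with every `User=` line pointing at `user`."""
    # Assumption: the unit declares its account with a line like "User=nvidia-docker".
    return re.sub(r"(?m)^User=.*$", "User=" + user, unit_text)

# Hypothetical unit-file excerpt, for illustration only.
sample = "[Service]\nUser=nvidia-docker\nType=simple\n"
print(set_unit_user(sample))
```

After changing a unit file on disk, systemd normally needs `sudo systemctl daemon-reload` before the change takes effect.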

sudo systemctl start nvidia-docker   now succeeds.

Continue with:  nvidia-docker run caffe2ai/caffe2:c2v0.8.1.cuda8.cudnn7.ubuntu16.04

It fails; it looks like I pulled the wrong Caffe2 Docker image:

Detectron ops lib not found at '/usr/local/lib/libcaffe2_detectron_ops_gpu.so'; make sure that your Caffe2 version includes Detectron module

Pull it again:

nvidia-docker pull caffe2ai/caffe2

nvidia-docker run  -d -it caffe2ai/caffe2  # detach with Ctrl+P then Ctrl+Q (press p and q in that order)
nvidia-docker run -it caffe2ai/caffe2:latest python -m caffe2.python.operator_test.relu_op_test  # test the new Docker image

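The problem persists. I searched for a long time and still could not work out where libcaffe2_detectron_ops_gpu.so is supposed to come from.

============  Starting over ==============

Try building Caffe2 myself...

Prepare the environment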

https://caffe2.ai/docs/getting-started.html?platform=ubuntu&configuration=compile

http://blog.csdn.net/zziahgf/article/details/79141879

git clone --recursive https://github.com/caffe2/caffe2.git && cd caffe2
cd docker/ubuntu-16.04-cuda8-cudnn6-all-options
sed -i -e 's/ --branch v0.8.1//g' Dockerfile 
docker build -t caffe2:cuda8-cudnn6-all-options .
cd $DETECTRON/docker
docker build -t detectron:c2-cuda8-cudnn6 .

The test run succeeds.

nvidia-docker run --rm -it detectron:c2-cuda8-cudnn6 python2 tests/test_batch_permutation_op.py
E0131 17:08:40.230015     1 init_intrinsics_check.cc:54] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0131 17:08:40.230031     1 init_intrinsics_check.cc:54] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0131 17:08:40.230036     1 init_intrinsics_check.cc:54] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
..
----------------------------------------------------------------------
Ran 2 tests in 0.745s

OK
nvidia-docker run -d -it detectron:c2-cuda8-cudnn6
nvidia-docker ps
nvidia-docker attach xxxx # the container ID found in the previous step

python2 tools/infer_simple.py \
    --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
    --output-dir /tmp/detectron-visualizations \
    --image-ext jpg \
    --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
    demo

Test results.

Video object segmentation: One-Shot Video Object Segmentation

git clone git@github.com:fperazzi/davis-2017.git

./configure.sh && make -C build/release

pip install virtualenv virtualenvwrapper
source /usr/local/bin/virtualenvwrapper.sh
mkvirtualenv davis
pip install -r python/requirements.txt
export PYTHONPATH=$(pwd)/python/lib

example

cd python/experiments
python read_write_segmentation.py

Error:

Traceback (most recent call last):
  File "read_write_segmentation.py", line 20, in <module>
    from   davis import cfg,io,DAVISLoader
  File "/root/game/davis-2017/python/lib/davis/__init__.py", line 13, in <module>
    from misc import log     # Logger
ImportError: No module named 'misc'

./python/lib/davis/misc/config.py

./python/lib/davis/__init__.py

Some imports in these files are wrong. After fixing them, another error appears:

[WARNING][11-01-2018 17:27:26] Temporal stability not available
Traceback (most recent call last):
  File "./python/experiments/read_write_segmentation.py", line 23, in <module>
    db = DAVISLoader(year=cfg.YEAR,phase=cfg.PHASE)
  File "/root/game/davis-2017/python/lib/davis/dataset/loader.py", line 73, in __init__
    for s in self._db_sequences]
  File "/root/game/davis-2017/python/lib/davis/dataset/base.py", line 97, in __init__
    osp.join(cfg.PATH.SEQUENCES,name),regex)
  File "/root/game/davis-2017/python/lib/davis/dataset/base.py", line 78, in __init__
    self.name,len(self),cfg.SEQUENCES[self.name].num_frames))
Exception: Incorrect frames number for sequence 'bike-packing': found 0, expected 69.

The cause is that the data files are missing.

Go to the data directory and run the two shell scripts; they download roughly 1 GB of files.

get_davis_results.sh
get_davis.sh
python python/tools/visualize.py

Hold down the space bar or the Enter key.

TensorFlow version: https://github.com/scaelles/OSVOS-TensorFlow

The machine cannot find libcudnn.so.6:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 72, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

Add this to .bashrc:

export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
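Before relaunching TensorFlow it can help to confirm that the library really sits in one of the directories on LD_LIBRARY_PATH. A minimal sketch follows; `find_shared_lib` is a hypothetical helper, and it inspects only LD_LIBRARY_PATH, whereas the dynamic loader also consults other locations such as the ld.so cache:

```python
import os

def find_shared_lib(name, search_path):
    """Return full paths of `name` found in the colon-separated search_path."""
    hits = []
    for directory in search_path.split(":"):
        candidate = os.path.join(directory, name)
        # Skip empty entries (e.g. from a trailing colon) and missing files.
        if directory and os.path.exists(candidate):
            hits.append(candidate)
    return hits

if __name__ == "__main__":
    path = os.environ.get("LD_LIBRARY_PATH", "")
    print(find_shared_lib("libcudnn.so.6", path) or "libcudnn.so.6 not on LD_LIBRARY_PATH")
```

An empty result after editing .bashrc usually means the shell has not re-read the file yet, or that the installed cuDNN has a different version suffix.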

srez – super-resolution

https://github.com/david-gpu/srez

python3 srez_main.py --run demo

Error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/imageio/plugins/ffmpeg.py", line 82, in get_exe
    auto=False)
  File "/usr/local/lib/python3.5/dist-packages/imageio/core/fetching.py", line 102, in get_remote_file
    raise NeedDownloadError()
imageio.core.fetching.NeedDownloadError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "srez_main.py", line 1, in <module>
    import srez_demo
  File "/root/srez/srez_demo.py", line 1, in <module>
    import moviepy.editor as mpe
  File "/usr/local/lib/python3.5/dist-packages/moviepy/editor.py", line 22, in <module>
    from .video.io.VideoFileClip import VideoFileClip
  File "/usr/local/lib/python3.5/dist-packages/moviepy/video/io/VideoFileClip.py", line 3, in <module>
    from moviepy.video.VideoClip import VideoClip
  File "/usr/local/lib/python3.5/dist-packages/moviepy/video/VideoClip.py", line 21, in <module>
    from .io.ffmpeg_writer import ffmpeg_write_image, ffmpeg_write_video
  File "/usr/local/lib/python3.5/dist-packages/moviepy/video/io/ffmpeg_writer.py", line 11, in <module>
    from moviepy.config import get_setting
  File "/usr/local/lib/python3.5/dist-packages/moviepy/config.py", line 35, in <module>
    FFMPEG_BINARY = get_exe()
  File "/usr/local/lib/python3.5/dist-packages/imageio/plugins/ffmpeg.py", line 86, in get_exe
    raise NeedDownloadError('Need ffmpeg exe. '
imageio.core.fetching.NeedDownloadError: Need ffmpeg exe. You can download it by calling:
  imageio.plugins.ffmpeg.download()

Fix: edit

srez_demo.py

and add two lines at the very top:

import imageio
imageio.plugins.ffmpeg.download()

Create a checkpoint folder and run again.

Error:

Traceback (most recent call last):
 File "srez_main.py", line 190, in <module>
 tf.app.run()
 File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 48, in run
 _sys.exit(main(_sys.argv[:1] + flags_passthrough))
 File "srez_main.py", line 185, in main
 _demo()
 File "srez_main.py", line 113, in _demo
 sess, summary_writer = setup_tensorflow()
 File "srez_main.py", line 103, in setup_tensorflow
 summary_writer = tf.train.SummaryWriter(FLAGS.train_dir, sess.graph)
AttributeError: module 'tensorflow.python.training.training' has no attribute 'SummaryWriter'
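For context: `tf.train.SummaryWriter` was removed in TensorFlow 1.0, and `tf.summary.FileWriter` is its replacement, so this AttributeError means the script targets an older TensorFlow than the one installed. One way to keep such a script working across versions is to look the class up by name. The sketch below uses a stand-in object instead of the real `tensorflow` module so that it runs anywhere; `first_available` is a hypothetical helper, not a TensorFlow API:

```python
from types import SimpleNamespace

def first_available(root, dotted_names):
    """Return the first attribute path in dotted_names that exists under root."""
    for dotted in dotted_names:
        obj = root
        try:
            for part in dotted.split("."):
                obj = getattr(obj, part)
        except AttributeError:
            continue  # this name does not exist in this version; try the next one
        return obj
    raise AttributeError("none of %r found" % (dotted_names,))

# Stand-in for a TensorFlow 1.x module, so the sketch runs without TensorFlow installed.
fake_tf = SimpleNamespace(summary=SimpleNamespace(FileWriter="FileWriter"),
                          train=SimpleNamespace())
writer_cls = first_available(fake_tf, ["summary.FileWriter", "train.SummaryWriter"])
print(writer_cls)
```

In srez_main.py the same idea would be `first_available(tf, [...])(FLAGS.train_dir, sess.graph)`, assuming both classes accept those two arguments.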

Upgrade tensorflow-gpu:
/usr/bin/pip3 install --upgrade tensorflow-gpu

Another error:

ImportError: libcudnn.so.6: cannot open shared object file: No such file or directory

See this article for reference.

 

MAgent installation and debugging notes

git clone git@github.com:geek-ai/MAgent.git
cd MAgent

brew install cmake llvm boost
brew install jsoncpp argp-standalone
brew tap david-icracked/homebrew-websocketpp
brew install --HEAD david-icracked/websocketpp/websocketpp

bash build.sh
export PYTHONPATH=$(pwd)/python:$PYTHONPATH

The build fails with:

make[2]: /usr/local/opt/llvm/bin/clang++: No such file or directory

 

Efficient Deep Learning for Stereo Matching

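An optical-flow-style solution built on a Siamese architecture.

Web: http://www.cs.toronto.edu/deepLowLevelVision/

Git: https://bitbucket.org/saakuraa/cvpr16_stereo_public

Aside: chapter 4 of 《第一本无人驾驶技术书》 (The First Book on Autonomous Driving Technology) mentions an impressive solution from Prof. Urtasun that produces the depth map for a stereo image pair in 0.3 seconds. I went looking for the implementation and searched GitHub for a long time; it turns out to be hosted on Bitbucket.

Quick test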

th inference_match_subimg.lua -g 0 --model split_win37_dep9 --data_version kitti2015 --data_root pretrain/kitti2015/sample_img --model_param pretrain/kitti2015/param.t7 --bn_meanstd pretrain/kitti2015/bn_meanstd.t7 --saveDir outImg --start_id 1 --n 1

Beforehand, convert the images to PNG and put them in pretrain/kitti2015/sample_img/image_2 and image_3.

Name them 00000{num}_10.png and change the --n parameter to 2.
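The naming rule above can be sketched in a few lines. `kitti_sample_names` is a hypothetical helper that follows the pattern from these notes literally, so it assumes a single-digit `num`:

```python
def kitti_sample_names(num):
    """Build the paths in both image folders for sample `num`, per the pattern in these notes."""
    root = "pretrain/kitti2015/sample_img"
    name = "00000{num}_10.png".format(num=num)
    return root + "/image_2/" + name, root + "/image_3/" + name

# With --start_id 1 and --n 2 the script would be expected to look for samples 1 and 2.
for n in (1, 2):
    print(kitti_sample_names(n))
```

Each sample needs a file with the same name in both folders, one per camera view.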

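Results. It is considerably faster than pyflow.

See the PyFlow test; the original test images are there as well.

pix2code: converting a GUI into iOS, Android, and web page source code

GitHub:  https://github.com/tonybeltramelli/pix2code

See the GitHub page for the workflow; it is very detailed.

Everything went smoothly until I trained on a 1070 at work, which failed with: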

python3 ./train.py ../datasets/web/training_set ../bin
Using TensorFlow backend.
Loading data...
Generating sparse vectors...
Dataset size: 143741
Vocabulary size: 19
Input shape: (256, 256, 3)
Output size: 19
Convert arrays...
Traceback (most recent call last):
 File "./train.py", line 66, in <module>
 run(input_path, output_path, is_memory_intensive=use_generator, pretrained_model=pretrained_weigths)
 File "./train.py", line 24, in run
 dataset.convert_arrays()
 File "/root/code/pix2code/model/classes/dataset/Dataset.py", line 82, in convert_arrays
 self.input_images = np.array(self.input_images)
MemoryError

I checked nvidia-smi and found a process holding 2 GB of GPU memory. I killed it and retried, but got the same error.
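The MemoryError is raised by `np.array(self.input_images)`, which allocates in host RAM, so freeing GPU memory would not be expected to help. A rough estimate of the allocation, using the dataset size and input shape from the log above; the 4 bytes per value (float32) is an assumption about the image dtype:

```python
def array_bytes(num_samples, shape, bytes_per_value):
    """Bytes needed for one dense array of num_samples items of the given shape."""
    total = num_samples
    for dim in shape:
        total *= dim
    return total * bytes_per_value

# Dataset size and input shape are taken from the log above;
# 4 bytes per value (float32) is an assumption about the image dtype.
needed = array_bytes(143741, (256, 256, 3), 4)
print("%.1f GiB" % (needed / 2.0 ** 30))
```

Under that assumption the array is far larger than the RAM of a typical workstation.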

I reread the README on GitHub.

The fix is to use the command below. It is also the officially recommended way to train; I just had not read it carefully.

python3 ./train.py ../datasets/web/training_set ../bin 1
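Judging from the tracebacks in these notes, train.py passes this flag on as `is_memory_intensive=use_generator` and then trains through `model.fit_generator`, so the trailing 1 presumably switches to feeding batches from a generator instead of building one giant array. The general idea, as a minimal sketch with hypothetical names (this is not pix2code's actual generator):

```python
def batch_generator(paths, load_fn, batch_size):
    """Yield lists of loaded samples, holding one batch in memory at a time."""
    while True:  # Keras-style generators loop forever; the caller bounds the steps
        for start in range(0, len(paths), batch_size):
            yield [load_fn(p) for p in paths[start:start + batch_size]]

# Hypothetical stand-ins: file names, and a loader that would normally read an image.
paths = ["img%d.png" % i for i in range(5)]
gen = batch_generator(paths, load_fn=str.upper, batch_size=2)
first = next(gen)
print(first)
```

Peak memory then scales with the batch size rather than with the size of the whole dataset.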

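Training then ran happily... On a 1080 Ti one epoch takes about 18 minutes, and the default setting is 10 epochs.

Nothing to report overnight... by morning the training had finally finished.

I took a screenshot of a blog page.

Generate the GUI

python3 ./sample.py ../bin pix2code  1.jpg  ../code

The generated GUI appears to be the page from the project's own demo, not the image I supplied. Strange. I will retrain and try again.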

header{
btn-inactive,btn-active
}
row{
single{
small-title,text,btn-green
}
}
row{
quadruple{
small-title,text,btn-orange
}
quadruple{
small-title,text,btn-orange
}
quadruple{
small-title,text,btn-orange
}
quadruple{
small-title,text,btn-orange
}
}
row{
double{
small-title,text,btn-orange
}
double{
small-title,text,btn-orange
}
}

I forgot to use python3, so the default python2 ran the whole training and then failed:

Traceback (most recent call last):
  File "./train.py", line 66, in <module>
    run(input_path, output_path, is_memory_intensive=use_generator, pretrained_model=pretrained_weigths)
  File "./train.py", line 51, in run
    model.fit_generator(generator, steps_per_epoch=steps_per_epoch)
  File "/home/endler/code/pix2code/model/classes/model/pix2code.py", line 70, in fit_generator
    self.save()
  File "/home/endler/code/pix2code/model/classes/model/AModel.py", line 18, in save
    self.model.save_weights("{}/{}.h5".format(self.output_path, self.name))
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 2580, in save_weights
    raise ImportError('`save_weights` requires h5py.')
ImportError: `save_weights` requires h5py.

Painful: python2.7 has no h5py installed... so I have to retrain yet again...
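A cheap guard against losing a whole training run this way is to check, with the same interpreter that will do the training, that the modules needed at save time can be imported. A minimal sketch; `missing_modules` is a hypothetical helper, and it runs under both python2 and python3:

```python
def missing_modules(names):
    """Return the module names from `names` that this interpreter cannot import."""
    missing = []
    for name in names:
        try:
            __import__(name)
        except ImportError:
            missing.append(name)
    return missing

if __name__ == "__main__":
    # h5py is the module named in the traceback above; add others as needed.
    print(missing_modules(["h5py"]))
```

Running this before starting the training would have shown `['h5py']` under the python2 environment described here.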

Go go go, I refuse to give up...

This error shows up again, but the weights file has already been generated.

Exception ignored in: <bound method BaseSession.__del__ of <tensorflow.python.client.session.Session object at 0x7f3f6f67ae80>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/client/session.py", line 696, in __del__
TypeError: 'NoneType' object is not callable

After several attempts, it seems it can only recognize the handful of demo images. ㄟ( ▔, ▔ )ㄏ

PyFlow: a Python implementation of optical flow

pip3 install Cython    # or: sudo apt-get install cython3
pip3 install fasttext
pip3 install Pillow
git clone https://github.com/pathak22/pyflow.git
cd pyflow/
python3 setup.py build_ext -i
python3 demo.py    # -viz option to visualize output

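Edit the demo and replace car1.jpg and car2.jpg with your own images.
My test shots were taken with an iPhone, sliding it sideways along the lower edge of a blackboard. Please ignore my messy living room. /(ㄒoㄒ)/~~

Age Gender Estimate TF: a TensorFlow demo for age recognition

GitHub: https://github.com/BoyuanJiang/Age-Gender-Estimate-TF

Clone the code.

Download the dataset to ~/data

wget   https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/static/imdb_crop.tar

Extract the archive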

tar -xvf imdb_crop.tar

Preprocess the dataset

python convert_to_records_multiCPU.py --imdb --nworks 8

When training, I found that the output files were empty.

I tried this:

example = tf.train.Example(features=tf.train.Features(feature={
 # 'height': _int64_feature(rows),
 # 'width': _int64_feature(cols),
 # 'depth': _int64_feature(depth),
 'age': _int64_feature(int(ages[index])),
 'gender': _int64_feature(int(genders[index])),
 'image_raw': _bytes_feature(image_raw),
 'file_name': _bytes_feature(str(file_name[index][0]))}))

Error:

'40/nm1102140_rm3713850624_1974-7-29_2013.jpg' has type <class 'str'>, but expected one of: ((<class 'bytes'>,),)

After removing

'file_name': _bytes_feature(str(file_name[index][0]))

the program runs normally, but training then fails with:

Invalid argument: Name: <unknown>, Feature: file_name (data type: string) is required but could not be found.

Change the file_name line as follows:

'file_name': _bytes_feature(bytes(file_name[index][0],'utf-8'))}))
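The root cause is that under Python 3 `str(...)` yields text, while a bytes feature only accepts `bytes`; `bytes(s, 'utf-8')` is equivalent to `s.encode('utf-8')`. A small sketch of the conversion on its own, without TensorFlow (`as_bytes` is a hypothetical helper, not part of the repository):

```python
def as_bytes(value, encoding="utf-8"):
    """Return value as bytes, encoding text values; bytes pass through unchanged."""
    if isinstance(value, bytes):
        return value
    return str(value).encode(encoding)

# File name taken from the error message above.
name = "40/nm1102140_rm3713850624_1974-7-29_2013.jpg"
print(type(as_bytes(name)))
```

Wrapping the value this way also keeps the line working if the file name already arrives as bytes.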

Regenerate the records...

Or switch to

python convert_to_records.py --imdb

and it works.

Download the models: https://pan.baidu.com/s/1dFewgqH

Trained model:

https://pan.baidu.com/s/1bpllJg7

Train

python3  train.py --lr 1e-3 --weight_decay 1e-5 --epoch 6 --batch_size 128 --keep_prob 0.8 --cuda

Test: select the best model

python3 test.py --images "./data/test" --model_path "./models" --batch_size 128 --choose_best --cuda
Age_MAE:7.07,Gender_Acc:80.92%,Age_model:./models/model.ckpt-12001,Gender_model:./models/model.ckpt-9001

Test with your own image

python3 eval.py --I "./demo/demo.jpg" --M "./models/" --font_scale 1 --thickness 1

 

Another implementation, in Keras: gender and age detection implemented in Keras