Mask R-CNN Installation Notes

apt-get install docker.io

nvidia-docker is at: https://github.com/NVIDIA/nvidia-docker

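Updating the package sources is extremely slow...

So download the package and install it directly. On Ubuntu the .rpm has to be installed with alien.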

wget  https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker-1.0.1-1.x86_64.rpm 

apt-get install alien
alien -i nvidia-docker-1.0.1-1.x86_64.rpm
nvidia-docker run caffe2ai/caffe2:c2v0.8.1.cuda8.cudnn7.ubuntu16.04

This fails with:

docker: Error response from daemon: create nvidia_driver_387.26: create nvidia_driver_387.26: Error looking up volume plugin nvidia-docker: legacy plugin: plugin not found.

sudo nvidia-docker-plugin

nvidia-docker-plugin | 2018/01/25 18:30:47 Loading NVIDIA unified memory
nvidia-docker-plugin | 2018/01/25 18:30:47 Loading NVIDIA management library
nvidia-docker-plugin | 2018/01/25 18:30:47 Discovering GPU devices
nvidia-docker-plugin | 2018/01/25 18:30:48 Error: cuda: out of memory

sudo systemctl start nvidia-docker also fails:

Jan 26 11:51:34 bj-s-19 systemd[1]: nvidia-docker.service: Main process exited, code=exited, status=217/USER
Jan 26 11:51:34 bj-s-19 systemd[1]: nvidia-docker.service: Control process exited, code=exited status=217
Jan 26 11:51:34 bj-s-19 systemd[1]: nvidia-docker.service: Control process exited, code=exited status=217

vim /usr/lib/systemd/system/nvidia-docker.service

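status=217/USER means systemd could not switch to the user configured for the service. In the unit file, change User (nvidia-docker) to root.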
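
The same edit can be scripted instead of made by hand in vim. This is a minimal sketch, assuming the unit file has a line of the form User=nvidia-docker. The function name set_unit_user is made up for illustration.

```shell
#!/bin/sh
# set_unit_user FILE USER: rewrite the User= line of a systemd unit file in place.
# Assumption: the unit uses the plain "User=<name>" form; check the file first.
set_unit_user() {
    unit_file=$1
    new_user=$2
    # Keep a .bak copy of the original so the change is easy to undo.
    sed -i.bak "s/^User=.*/User=${new_user}/" "$unit_file"
}

# Usage on the host (needs root), followed by a reload so systemd rereads the unit:
#   set_unit_user /usr/lib/systemd/system/nvidia-docker.service root
#   systemctl daemon-reload
#   systemctl start nvidia-docker
```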

sudo systemctl start nvidia-docker now succeeds.

Continue: nvidia-docker run caffe2ai/caffe2:c2v0.8.1.cuda8.cudnn7.ubuntu16.04

This fails. It looks like I pulled the wrong Caffe2 Docker image:

Detectron ops lib not found at '/usr/local/lib/libcaffe2_detectron_ops_gpu.so'; make sure that your Caffe2 version includes Detectron module

Pull it again:

nvidia-docker pull caffe2ai/caffe2

nvidia-docker run -d -it caffe2ai/caffe2  # to detach, press Ctrl+P then Ctrl+Q (p and q in that order)
nvidia-docker run -it caffe2ai/caffe2:latest python -m caffe2.python.operator_test.relu_op_test  # test the new Docker image

The problem persists. After a long search I still could not work out where libcaffe2_detectron_ops_gpu.so comes from.
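
Before running Detectron, it can help to check whether an image ships the library at all. This is a rough sketch, not part of the original notes. The helper name find_detectron_ops is hypothetical, and the directories to search are whatever you pass in. /usr/local/lib is the path taken from the error message above.

```shell
#!/bin/sh
# find_detectron_ops DIR...: print the first libcaffe2_detectron_ops_gpu.so found
# under the given directories; return non-zero if none of them contains it.
find_detectron_ops() {
    for dir in "$@"; do
        # Skip arguments that are not existing directories.
        [ -d "$dir" ] || continue
        hit=$(find "$dir" -name 'libcaffe2_detectron_ops_gpu.so' 2>/dev/null | head -n 1)
        if [ -n "$hit" ]; then
            printf '%s\n' "$hit"
            return 0
        fi
    done
    return 1
}

# Usage inside a container started from the image under test:
#   find_detectron_ops /usr/local/lib || echo "no Detectron ops lib in this image"
```

If nothing is found, the image's Caffe2 build most likely does not include the Detectron module. The error message points to the same cause.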

============  Starting over  ==============

Try building Caffe2 myself...

Prepare the environment:

https://caffe2.ai/docs/getting-started.html?platform=ubuntu&configuration=compile

http://blog.csdn.net/zziahgf/article/details/79141879

git clone --recursive https://github.com/caffe2/caffe2.git && cd caffe2
cd docker/ubuntu-16.04-cuda8-cudnn6-all-options
sed -i -e 's/ --branch v0.8.1//g' Dockerfile 
docker build -t caffe2:cuda8-cudnn6-all-options .
cd "$DETECTRON/docker"  # $DETECTRON is the path to the Detectron source checkout
docker build -t detectron:c2-cuda8-cudnn6 .
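
The sed step above strips the version pin from the Dockerfile, so the image is no longer built from the v0.8.1 tag. The sketch below applies the same expression to one line of text to show the effect. The sample Dockerfile line is hypothetical and only for illustration. The real file may be worded differently.

```shell
#!/bin/sh
# strip_branch_pin: remove the " --branch v0.8.1" option from text on stdin,
# using the same sed expression as the command above.
strip_branch_pin() {
    sed -e 's/ --branch v0.8.1//g'
}

# Hypothetical Dockerfile line, for illustration only:
#   printf '%s\n' 'RUN git clone --branch v0.8.1 --recursive https://github.com/caffe2/caffe2.git' | strip_branch_pin
```

Without the --branch option, git clone checks out the repository's default branch.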

It runs successfully:

nvidia-docker run --rm -it detectron:c2-cuda8-cudnn6 python2 tests/test_batch_permutation_op.py
E0131 17:08:40.230015     1 init_intrinsics_check.cc:54] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0131 17:08:40.230031     1 init_intrinsics_check.cc:54] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
E0131 17:08:40.230036     1 init_intrinsics_check.cc:54] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
..
----------------------------------------------------------------------
Ran 2 tests in 0.745s

OK
nvidia-docker run -d -it detectron:c2-cuda8-cudnn6
nvidia-docker ps
nvidia-docker attach xxxx  # the container ID found in the previous step
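
The ps lookup can be skipped, because run -d prints the new container's ID on stdout. This is a small sketch assuming nvidia-docker passes that output through unchanged. The helper name start_detached is invented for this example.

```shell
#!/bin/sh
# start_detached RUNNER IMAGE: start IMAGE detached with a TTY and print
# whatever "run -d" writes to stdout (the new container's ID).
start_detached() {
    runner=$1   # e.g. nvidia-docker or docker
    image=$2
    "$runner" run -d -it "$image"
}

# Usage:
#   cid=$(start_detached nvidia-docker detectron:c2-cuda8-cudnn6)
#   nvidia-docker attach "$cid"
```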

python2 tools/infer_simple.py \
    --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml \
    --output-dir /tmp/detectron-visualizations \
    --image-ext jpg \
    --wts https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl \
    demo

Test the results.
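
infer_simple.py writes its PDF visualizations to the --output-dir inside the container, so they need to be copied to the host before they can be opened. This is one possible way, sketched with docker cp. The function name fetch_visualizations and the destination path are assumptions.

```shell
#!/bin/sh
# fetch_visualizations CONTAINER DEST: copy the --output-dir used above
# (/tmp/detectron-visualizations) out of the container to DEST on the host.
fetch_visualizations() {
    container=$1
    dest=$2
    docker cp "${container}:/tmp/detectron-visualizations" "$dest"
}

# Usage, with the container ID from "nvidia-docker ps":
#   fetch_visualizations xxxx ./detectron-visualizations
```

Mounting a host directory with -v when starting the container and pointing --output-dir at it would avoid the copy step.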

15 thoughts on "Mask R-CNN Installation Notes"

      1. Hi admin,
        --output-dir /home/ray/Detectron/demo/output \
        No PDF files were written to this folder.
        I am also using nvidia-docker. How do you view the PDFs?

  1. Hello, what causes the "out of memory Error from operator" message?

    E0206 15:16:13.353634 10920 init_intrinsics_check.cc:54] CPU feature avx is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    E0206 15:16:13.353668 10920 init_intrinsics_check.cc:54] CPU feature avx2 is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    E0206 15:16:13.353675 10920 init_intrinsics_check.cc:54] CPU feature fma is present on your machine, but the Caffe2 binary is not compiled with it. It means you may not get the full speed of your CPU.
    WARNING cnn.py: 40: [====DEPRECATE WARNING====]: you are creating an object from CNNModelHelper class which will be deprecated soon. Please use ModelHelper object with brew module. For more information, please refer to caffe2.ai and python/brew.py, python/brew_test.py for more information.
    INFO net.py: 57: Loading weights from: /tmp/detectron-download-cache/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl
    I0206 15:16:18.946543 10920 net_dag_utils.cc:118] Operator graph pruning prior to chain compute took: 0.000200573 secs
    I0206 15:16:18.946779 10920 net_dag.cc:61] Number of parallel execution chains 63 Number of operators = 402
    I0206 15:16:18.967525 10920 net_dag_utils.cc:118] Operator graph pruning prior to chain compute took: 0.000145573 secs
    I0206 15:16:18.967732 10920 net_dag.cc:61] Number of parallel execution chains 30 Number of operators = 358
    I0206 15:16:18.969389 10920 net_dag_utils.cc:118] Operator graph pruning prior to chain compute took: 1.1131e-05 secs
    I0206 15:16:18.969425 10920 net_dag.cc:61] Number of parallel execution chains 5 Number of operators = 18
    INFO infer_simple.py: 111: Processing demo/24274813513_0cfd2ce6d0_k.jpg -> /tmp/detectron-visualizations/24274813513_0cfd2ce6d0_k.jpg.pdf
    terminate called after throwing an instance of 'caffe2::EnforceNotMet'
    what(): [enforce fail at context_gpu.cu:343] error == cudaSuccess. 2 vs 0. Error at: /home/wn/caffe2/caffe2/core/context_gpu.cu:343: out of memory Error from operator:
    input: "gpu_0/res4_0_branch2b" input: "gpu_0/res4_0_branch2c_w" output: "gpu_0/res4_0_branch2c" name: "" type: "Conv" arg { name: "kernel" i: 1 } arg { name: "exhaustive_search" i: 0 } arg { name: "stride" i: 1 } arg { name: "pad" i: 0 } arg { name: "order" s: "NCHW" } arg { name: "dilation" i: 1 } device_option { device_type: 1 cuda_gpu_id: 0 } engine: "CUDNN"
    *** Aborted at 1517901380 (unix time) try "date -d @1517901380" if you are using GNU date ***
    PC: @ 0x7f2179ea3428 gsignal
    *** SIGABRT (@0x2aa8) received by PID 10920 (TID 0x7f2119f50700) from PID 10920; stack trace: ***
    @ 0x7f2179ea34b0 (unknown)
    @ 0x7f2179ea3428 gsignal
    @ 0x7f2179ea502a abort
    @ 0x7f217398f84d __gnu_cxx::__verbose_terminate_handler()
    @ 0x7f217398d6b6 (unknown)
    @ 0x7f217398d701 std::terminate()
    @ 0x7f21739b8d38 (unknown)
    @ 0x7f217a23f6ba start_thread
    @ 0x7f2179f7541d clone
    @ 0x0 (unknown)
    Aborted (core dumped)

      1. root@wn:/opt/detectron# nvidia-smi
        Tue Feb 6 16:43:54 2018
        +-----------------------------------------------------------------------------+
        | NVIDIA-SMI 384.111                Driver Version: 384.111                   |
        |-------------------------------+----------------------+----------------------+
        | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
        | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
        |===============================+======================+======================|
        |   0  GeForce GT 720      Off  | 00000000:01:00.0 N/A |                  N/A |
        | 43%   41C    P0    N/A /  N/A |    177MiB /  1998MiB |     N/A      Default |
        +-------------------------------+----------------------+----------------------+

        +-----------------------------------------------------------------------------+
        | Processes:                                                       GPU Memory |
        |  GPU       PID   Type   Process name                             Usage      |
        |=============================================================================|
        |    0                    Not Supported                                       |
        +-----------------------------------------------------------------------------+

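          1. I lowered it to 1 and it still fails. I then rebuilt the CPU version and got this error: AssertionError: Detectron ops lib not found at '/home/wn/caffe2/build/lib/libcaffe2_detectron_ops_gpu.so'; make sure that your Caffe2 version includes Detectron module. I see your article mentions this error as well.

          1. Isn't the order to build Caffe2 first and then Detectron? Does the infer_simple.py example only run with the GPU build? The program calls GPU-related libraries.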
