github https://github.com/HanSon/vbot
https://github.com/HanSon/my-vbot
修改文件 Example.php
$this->config = $default_config;//array_merge($default_config, $this->config);
修改了一个文件,以实现收到文字回复笔画的功能
MessageHandler.php
如需主动发起消息请安装swoole,并修改config文件。
pecl install swoole
<?php
namespace Hanson\MyVbot;
use Hanson\MyVbot\Handlers\Contact\ColleagueGroup;
use Hanson\MyVbot\Handlers\Contact\ExperienceGroup;
use Hanson\MyVbot\Handlers\Contact\FeedbackGroup;
use Hanson\MyVbot\Handlers\Contact\Hanson;
use Hanson\MyVbot\Handlers\Type\RecallType;
use Hanson\MyVbot\Handlers\Type\TextType;
use Hanson\Vbot\Contact\Friends;
use Hanson\Vbot\Contact\Groups;
use Hanson\Vbot\Contact\Members;
use Hanson\Vbot\Message\Emoticon;
use Hanson\Vbot\Message\Text;
use Illuminate\Support\Collection;
class MessageHandler
{
    /**
     * Dispatch an incoming vbot message to the example feature handlers and
     * implement the stroke-order reply feature: when a friend (or someone
     * @-mentioning the bot in a group) sends text, look up a stroke-order
     * GIF for the first character on shufaji.com, cache it under ./img and
     * send it back as an emoticon.
     *
     * @param Collection $message normalized message from the vbot core
     * @return void
     */
    public static function messageHandler(Collection $message)
    {
        /** @var Friends $friends */
        $friends = vbot('friends');
        /** @var Members $members */
        $members = vbot('members');
        /** @var Groups $groups */
        $groups = vbot('groups');

        // Feature handlers shipped with the example project.
        Hanson::messageHandler($message, $friends, $groups);
        ColleagueGroup::messageHandler($message, $friends, $groups);
        FeedbackGroup::messageHandler($message, $friends, $groups);
        ExperienceGroup::messageHandler($message, $friends, $groups);
        TextType::messageHandler($message, $friends, $groups);
        RecallType::messageHandler($message);

        // Greet new friends and pull them into the experience group.
        if ($message['type'] === 'new_friend') {
            Text::send($message['from']['UserName'], '客官,等你很久了!感谢跟 vbot 交朋友,如果可以帮我点个star,谢谢了!https://github.com/HanSon/vbot');
            $groups->addMember($groups->getUsernameByNickname('Vbot 体验群'), $message['from']['UserName']);
            Text::send($message['from']['UserName'], '现在拉你进去vbot的测试群,进去后为了避免轰炸记得设置免骚扰哦!如果被不小心踢出群,跟我说声“拉我”我就会拉你进群的了。');
        }

        // 50% chance to echo a random emoticon back when one is received.
        if ($message['type'] === 'emoticon' && random_int(0, 1)) {
            Emoticon::sendRandom($message['from']['UserName']);
        }

        // @todo official-account messages are only logged for now.
        if ($message['type'] === 'official') {
            vbot('console')->log('收到公众号消息:'.$message['title'].$message['description'].
                $message['app'].$message['url']);
        }

        if ($message['type'] === 'request_friend') {
            vbot('console')->log('收到好友申请:'.$message['info']['Content'].$message['avatar']);
            // Auto-approve only requests whose greeting is one of these magic words.
            if (in_array($message['info']['Content'], ['echo', 'print_r', 'var_dump', 'print'], true)) {
                $friends->approve($message);
            }
        }

        // Decide whether the stroke-order feature should reply: always in
        // direct friend chats, in groups only when the bot is @-mentioned.
        $shouldReply = false;
        $nick = '';
        if ($message['fromType'] === 'Friend') {
            $nick = $message['from']['NickName'];
            $shouldReply = true;
        }
        if ($message['fromType'] === 'Group') {
            $nick = $message['sender']['NickName'];
            // empty() replaces the previous @-suppressed read of a key
            // that may be absent on non-@ group messages.
            if (!empty($message['isAt'])) {
                $shouldReply = true;
            }
        }

        if ($shouldReply) {
            // First character of the text, hex-encoded the way shufaji.com expects.
            $zi = mb_substr($message['message'], 0, 1, 'utf-8');
            $uni = self::unicode_encode($zi);
            $var = trim($uni);
            // [$len] replaces the {$len} offset syntax (deprecated in PHP 7.4,
            // removed in PHP 8). The last hex digit selects the remote sub-directory.
            $las = $var[strlen($var) - 1];
            $url = 'http://www.shufaji.com/datafile/bd/gif/'.$las.'/'.$uni.'.gif';

            $dir = __DIR__.'/img';
            $file = $dir.'/'.$uni.'.gif';
            if (!is_file($file)) {
                // The cache directory may not exist on first run.
                if (!is_dir($dir)) {
                    mkdir($dir, 0777, true);
                }
                // Best effort: remote fetch may 404 for characters without a GIF.
                $img = @file_get_contents($url);
                if (!empty($img)) {
                    file_put_contents($file, $img);
                    Emoticon::send($message['from']['UserName'], $file);
                } else {
                    Text::send($message['from']['UserName'], "@".$nick." 找不到这个字的笔顺".$url);
                }
            } else {
                // Cached copy exists — send it directly.
                Emoticon::send($message['from']['UserName'], $file);
            }
        }
    }

    /**
     * Hex-encode a UTF-8 string as big-endian UCS-2 hex digits,
     * e.g. "我" -> "6211". Bytes with a zero high byte (ASCII) are
     * appended verbatim instead of hex-encoded.
     *
     * NOTE(review): plain 'UCS-2' byte order is platform-dependent in iconv;
     * this code relies on it being big-endian — confirm on the deployment host.
     *
     * @param string $name UTF-8 input
     * @return string lowercase hex string (raw chars for ASCII input)
     */
    private static function unicode_encode($name)
    {
        $name = iconv('UTF-8', 'UCS-2', $name);
        $len = strlen($name);
        $str = '';
        // Walk the UCS-2 buffer two bytes at a time.
        for ($i = 0; $i < $len - 1; $i += 2) {
            $c = $name[$i];
            $c2 = $name[$i + 1];
            if (ord($c) > 0) {
                // Multi-byte character: emit both bytes as zero-padded hex
                // (sprintf replaces the manual base_convert + "0" padding).
                $str .= sprintf('%02x%02x', ord($c), ord($c2));
            } else {
                // ASCII: keep the low byte as-is.
                $str .= $c2;
            }
        }
        return $str;
    }
}
itchat 调试完毕后,开始折腾聊天的server
https://ask.julyedu.com/question/7410
首先准备好 torch 环境,然后安装 nn,rnn,async
sudo ~/torch/install/bin/luarocks install nn
sudo ~/torch/install/bin/luarocks install rnn
sudo ~/torch/install/bin/luarocks install async penlight cutorch cunn
下载程序和语料
git clone --recursive https://github.com/rustcbf/chatbot-zh-torch7 #代码
git clone --recursive https://github.com/rustcbf/dgk_lost_conv #语料
git clone --recursive https://github.com/chenb67/neuralconvo #以上两个在此源码进行改进,可作为参考
将 dgk_lost_conv 里的 xiaohuangji50w_fenciA.zip 解压放到外层目录
th train.lua --cuda --dataset 5000 --hiddenSize 100
（注意:参数前是两个英文连字符 `--`,不要复制成长破折号 `–`,否则参数不生效)
报错
-- Epoch 1 / 30
/root/torch/install/bin/luajit: ./seq2seq.lua:50: attempt to call field 'recursiveCopy' (a nil value)
stack traceback:
./seq2seq.lua:50: in function 'forwardConnect'
./seq2seq.lua:67: in function 'train'
train.lua:90: in main chunk
[C]: in function 'dofile'
/root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
修改 seq2seq.lua 如下 (50 – 70 行间)
--[[ Forward coupling: copy the encoder LSTM's final output/cell state into
     the decoder LSTM so decoding starts from the encoded sentence state.
     Fix: newer rnn releases dropped nn.rnn.recursiveCopy (the nil-call error
     above); nn.utils.recursiveCopy is the supported replacement. ]]--
function Seq2Seq:forwardConnect(inputSeqLen)
self.decoderLSTM.userPrevOutput =
--nn.rnn.recursiveCopy(self.decoderLSTM.userPrevOutput, self.encoderLSTM.outputs[inputSeqLen])
nn.utils.recursiveCopy(self.decoderLSTM.userPrevOutput, self.encoderLSTM.outputs[inputSeqLen])
self.decoderLSTM.userPrevCell =
nn.utils.recursiveCopy(self.decoderLSTM.userPrevCell, self.encoderLSTM.cells[inputSeqLen])
end
--[[ Backward coupling: Copy decoder gradients to encoder LSTM ]]--
-- Same fix as forwardConnect: nn.utils.recursiveCopy replaces nn.rnn.recursiveCopy.
function Seq2Seq:backwardConnect()
-- These gradient fields only exist after a forward/backward pass, hence the nil guards.
if(self.encoderLSTM.userNextGradCell ~= nil) then
self.encoderLSTM.userNextGradCell =
nn.utils.recursiveCopy(self.encoderLSTM.userNextGradCell, self.decoderLSTM.userGradPrevCell)
end
if(self.encoderLSTM.gradPrevOutput ~= nil) then
self.encoderLSTM.gradPrevOutput =
nn.utils.recursiveCopy(self.encoderLSTM.gradPrevOutput, self.decoderLSTM.userGradPrevOutput)
end
end
训练之,1080ti 一轮大概 两个多小时。。。 30轮估计需要70小时。妇女节后见了。
eval.lua 的时候报错,不明所以,先放弃这个了,试试别的。
/root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/nn/Container.lua:67:
In 3 module of nn.Sequential:
/root/torch/install/share/lua/5.1/torch/Tensor.lua:466: Wrong size for view. Input size: 100. Output size: 6561
stack traceback:
[C]: in function 'error'
/root/torch/install/share/lua/5.1/torch/Tensor.lua:466: in function 'view'
/root/torch/install/share/lua/5.1/rnn/utils.lua:191: in function 'recursiveZeroMask'
/root/torch/install/share/lua/5.1/rnn/MaskZero.lua:37: in function 'updateOutput'
/root/torch/install/share/lua/5.1/rnn/Recursor.lua:13: in function '_updateOutput'
/root/torch/install/share/lua/5.1/rnn/AbstractRecurrent.lua:50: in function 'updateOutput'
/root/torch/install/share/lua/5.1/rnn/Sequencer.lua:53: in function </root/torch/install/share/lua/5.1/rnn/Sequencer.lua:34>
[C]: in function 'xpcall'
/root/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/root/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
./seq2seq.lua:115: in function 'eval'
eval.lua:90: in function 'say'
eval.lua:105: in main chunk
[C]: in function 'dofile'
/root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
[C]: in function 'error'
/root/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
/root/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
./seq2seq.lua:115: in function 'eval'
eval.lua:90: in function 'say'
eval.lua:105: in main chunk
[C]: in function 'dofile'
/root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50
换一个试试 ,tensorflow 的,而且还比较新
git: https://github.com/qhduan/just_another_seq2seq
包含还有一个英译汉的功能。
把之前下载的语料库中的 dgk_shooter_min.conv.zip 解压缩放到 chatbot 目录里
python3 train.py 训练之。
20轮,一轮大概30-40分钟(1070 显卡)
训练完之后执行 python3 test.py 测试。
注意,tensorflow 在 1.4.1 下。
并修改 test.py 否则 sequence_to_sequence 会报
NotFoundError (see above for traceback): Key decoder/multi_rnn_cell/cell_0/Attention_Wrapper/lstm_cell/bias not found in checkpoint
test(
bidirectional=False,
cell_type='lstm',
depth=2,
attention_type='Bahdanau',
use_residual=False,
use_dropout=False,
time_major=False,
hidden_units=512
)
聊天效果,chatbot_ad 的还行
训练chatbot_ad 的时候,readme 少了一步 train_tfidf.py 。
Input Chat Sentence:我生病了
rl: ['我', '睡', '着', '了', '</s>', '</s>', '</s>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:我想我爱你
rl: ['我', '不', '知', '道', '你', '在', '说', '什', '么', '</s>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:你多大了?
rl: ['你', '看', '上', '去', '不', '错', '</s>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:你好吗?
rl: ['很', '高', '兴', '见', '到', '你', '</s>', '</s>', '<unk>', '<unk>']
Input Chat Sentence:什么时间了
rl: ['你', '要', '去', '哪', '儿', '</s>', '</s>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:去北京
rl: ['维', '克', '多', ',', '过', '来', '</s>', '</s>', '</s>', '</s>', '<unk>', '<unk>']
Input Chat Sentence:去哪?
rl: ['我', '们', '得', '走', '了', '</s>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:走
rl: ['我', '们', '得', '走', '了', '</s>', '</s>', '<unk>']
Input Chat Sentence:走了
rl: ['你', '们', '都', '走', '了', '</s>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:去哪
rl: ['我', '也', '不', '知', '道', '</s>', '</s>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:干啥
rl: ['你', '在', '干', '啥', '啊', '</s>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:他是谁?
rl: ['不', '知', '道', '为', '什', '么', '</s>', '</s>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:你是谁?
rl: ['我', '是', '麦', '克', '墨', '菲', '医', '生', '</s>', '<unk>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:你哎我 吗?
rl: ['我', '有', '话', '跟', '你', '说', '</s>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:你爱我 吗?
rl: ['什', '么', '东', '西', '?', '</s>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:你爱我吗?
rl: ['我', '爱', '你', ',', '宝', '贝', '</s>', '<unk>', '<unk>', '<unk>', '<unk>']
Input Chat Sentence:
chatbot_ad 用 bottle 改造了一个 url api接口用于和 itchat 对接。代码如下。
# -*- coding: utf-8 -*-
"""
Expose the trained SequenceToSequence chatbot (chatbot_ad checkpoint) as a
small HTTP API via bottle, so the PHP/itchat bot can fetch replies by URL.

(Indentation restored here — the blog paste stripped it; statement order
and tokens are unchanged.)
"""
import sys
import random
import pickle
import numpy as np
import tensorflow as tf
import bottle
sys.path.append('..')
from data_utils import batch_flow
from sequence_to_sequence import SequenceToSequence
from word_sequence import WordSequence # pylint: disable=unused-variable

# Fix every RNG seed so decoding is deterministic between runs.
random.seed(0)
np.random.seed(0)
tf.set_random_seed(0)

# chatbot.pkl is produced by training; only the word-sequence vocab (ws) is used here.
_, _, ws = pickle.load(open('chatbot.pkl', 'rb'))

# CPU-only inference config for the serving process (GPU count forced to 0).
config = tf.ConfigProto(
    device_count={'CPU': 1, 'GPU': 0},
    allow_soft_placement=True,
    log_device_placement=False
)

save_path_rl = './s2ss_chatbot_ad.ckpt'
graph_rl = tf.Graph()

# Build the decode-mode model once at startup and keep the session open.
# NOTE(review): these hyperparameters must match the ones used at training
# time, otherwise restoring the checkpoint fails with "Key ... not found".
with graph_rl.as_default():
    model_rl = SequenceToSequence(
        input_vocab_size=len(ws),
        target_vocab_size=len(ws),
        batch_size=1,
        mode='decode',
        beam_width=12,
        bidirectional=False,
        cell_type='lstm',
        depth=1,
        attention_type='Bahdanau',
        use_residual=False,
        use_dropout=False,
        parallel_iterations=1,
        time_major=False,
        hidden_units=1024,
        share_embedding=True
    )
    init = tf.global_variables_initializer()
    sess_rl = tf.Session(config=config)
    sess_rl.run(init)
    model_rl.load(sess_rl, save_path_rl)


@bottle.route('/login/<w>', method='GET')
def do_login(w):
    # w is the chat sentence taken from the URL path.
    user_text = w
    x_test = list(user_text.lower())  # character-level tokenization
    x_test = [x_test]
    bar = batch_flow([x_test], [ws], 1)
    x, xl = next(bar)
    pred_rl = model_rl.predict(
        sess_rl,
        np.array(x),
        np.array(xl)
    )
    #word = bottle.request.forms.get("word")
    # Join the predicted token ids back into a reply string.
    str2 = ''.join(str(i) for i in ws.inverse_transform(pred_rl[0]))
    return str2


bottle.run(host='0.0.0.0', port=8080)  # listen on all interfaces, port 8080
注意不要聊的太猛,容易被腾讯封了。
[2018-03-12 02:34:54][INFO] please scan the qrCode with wechat.
[2018-03-12 02:35:01][INFO] please confirm login in wechat.
Array
(
[ret] => 1203
[message] => 当前登录环境异常。为了你的帐号安全,暂时不能登录web微信。你可以通过Windows微信、Mac微信或者手机客户端微信登录。
)
[2018-03-12 02:35:03] vbot.ERROR: Undefined index: skey [] []
PHP Fatal error: Uncaught ErrorException: Undefined index: skey in /Users/zhiweipang/my-vbot/vendor/hanson/vbot/src/Core/Server.php:194