Some Pitfalls When Building TensorFlow on Windows

Original article. If you repost it, please credit the source: 慢慢的回味


Notes on Building TensorFlow on Windows
1. During the link step of the build, errors like the following appear:
unresolved external symbol icu_62::isSuccess
unresolved external symbol icu_62::UnicodeStringAppendable

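The exact cause is unclear. What the two functions have in common is that each is implemented directly in a header file (.h) and is not referenced from any .cpp file, so the generated object files (.o) contain no corresponding symbol, and the TensorFlow link step cannot find it.
The fix:
For the first unresolved symbol, replace the isSuccess() call in the corresponding TensorFlow code with a negated isFailure() call.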

File: tensorflow-master\tensorflow\core\kernels\unicode_script_op.cc

class UnicodeScriptOp : public OpKernel {
 public:
  explicit UnicodeScriptOp(OpKernelConstruction* context) : OpKernel(context) {}
 
  void Compute(OpKernelContext* context) override {
    const Tensor* input_tensor;
    OP_REQUIRES_OK(context, context->input("input", &input_tensor));
    const auto& input_flat = input_tensor->flat<int32>();
 
    Tensor* output_tensor = nullptr;
    OP_REQUIRES_OK(context,
                   context->allocate_output("output", input_tensor->shape(),
                                            &output_tensor));
    auto output_flat = output_tensor->flat<int32>();
 
    icu::ErrorCode status;
    for (int i = 0; i < input_flat.size(); i++) {
      UScriptCode script_code = uscript_getScript(input_flat(i), status);
      if (!status.isFailure()) {  // changed here: isSuccess() replaced by !isFailure()
        output_flat(i) = script_code;
      } else {
        output_flat(i) = -1;
        status.reset();
      }
    }
  }
};

For the second unresolved symbol, modify the icu4c source code.
File: \icu4c\source\common\unicode\appendable.h

class U_COMMON_API UnicodeStringAppendable : public Appendable {
public:
    /**
     * Aliases the UnicodeString (keeps its reference) for writing.
     * @param s The UnicodeString to which this Appendable will write.
     * @stable ICU 4.8
     */
    explicit UnicodeStringAppendable(UnicodeString &s);  // changed here: the inline implementation is removed

File: icu4c\source\common\unistr.cpp

// UnicodeStringAppendable ------------------------------------------------- ***
UnicodeStringAppendable::UnicodeStringAppendable(UnicodeString &s) : str(s) {}  // the implementation is moved here
 
UnicodeStringAppendable::~UnicodeStringAppendable() {}

2. While generating the Python API, the build keeps failing with "ImportError: DLL load failed: 找不到指定的模块。" (in English: "The specified module could not be found.")
ERROR: G:/tensorflow-master/tensorflow/python/keras/api/BUILD:28:1: Executing genrule //tensorflow/python/keras/api:keras_python_api_gen_compat_v1 failed (Exit 1): bash.exe failed: error executing command//platforms:host_platform
Traceback (most recent call last):
  File "\\?\C:\Users\ADMINI~1\AppData\Local\Temp\Bazel.runfiles_8d0ttsie\runfiles\org_tensorflow\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "\\?\C:\Users\ADMINI~1\AppData\Local\Temp\Bazel.runfiles_8d0ttsie\runfiles\org_tensorflow\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "\\?\C:\Users\ADMINI~1\AppData\Local\Temp\Bazel.runfiles_8d0ttsie\runfiles\org_tensorflow\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "C:\Program Files\Python36\lib\imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "C:\Program Files\Python36\lib\imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: DLL load failed: 找不到指定的模块。
 
During handling of the above exception, another exception occurred:
 
Traceback (most recent call last):
  File "\\?\C:\Users\ADMINI~1\AppData\Local\Temp\Bazel.runfiles_8d0ttsie\runfiles\org_tensorflow\tensorflow\python\tools\api\generator\create_python_api.py", line 27, in <module>
    from tensorflow.python.tools.api.generator import doc_srcs
  File "\\?\C:\Users\ADMINI~1\AppData\Local\Temp\Bazel.runfiles_8d0ttsie\runfiles\org_tensorflow\tensorflow\python\__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "\\?\C:\Users\ADMINI~1\AppData\Local\Temp\Bazel.runfiles_8d0ttsie\runfiles\org_tensorflow\tensorflow\python\pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "\\?\C:\Users\ADMINI~1\AppData\Local\Temp\Bazel.runfiles_8d0ttsie\runfiles\org_tensorflow\tensorflow\python\pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "\\?\C:\Users\ADMINI~1\AppData\Local\Temp\Bazel.runfiles_8d0ttsie\runfiles\org_tensorflow\tensorflow\python\pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "\\?\C:\Users\ADMINI~1\AppData\Local\Temp\Bazel.runfiles_8d0ttsie\runfiles\org_tensorflow\tensorflow\python\pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "C:\Program Files\Python36\lib\imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "C:\Program Files\Python36\lib\imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: DLL load failed: 找不到指定的模块。
 
 
Failed to load the native TensorFlow runtime.
 
See https://www.tensorflow.org/install/errors
 
for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.
Target //tensorflow/tools/pip_package:build_pip_package failed to build
INFO: Elapsed time: 48.295s, Critical Path: 41.60s
INFO: 7 processes: 7 local.
FAILED: Build did NOT complete successfully

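Debugging eventually showed that the cause was an incomplete DLL search path. I did not know how to fix this in the Bazel configuration, so the workaround was to force the PATH at the point where the library is loaded.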
File: C:\Program Files\Python36\Lib\imp.py

if create_dynamic:
    def load_dynamic(name, path, file=None):
        """**DEPRECATED**
 
        Load an extension module.
        """
        import importlib.machinery
        loader = importlib.machinery.ExtensionFileLoader(name, path)
        import os
        print( os.getenv('PATH'))
        # Force the PATH environment variable here
        os.putenv('PATH', r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\libnvvp;C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\Program Files\Python36\Scripts\;C:\Program Files\Python36\;C:\Program Files\Python37\Scripts\;C:\Program Files\Python37\;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;C:\Windows\System32\OpenSSH\;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\extras\CUPTI\libx64;G:\cuda\bin;C:\msys64\usr\bin;H:\Downloads\icu\bin64;H:\Downloads\icu;H:\Downloads\icu\lib64;C:\Users\Administrator\AppData\Local\Microsoft\WindowsApps;;C:\Program Files\Python36\lib\site-packages\numpy\.libs")
        # Note: os.putenv() does not update os.environ, so this still prints the old value
        print( os.getenv('PATH'))
        print(path)
 
        # Issue #24748: Skip the sys.modules check in _load_module_shim;
        # always load new extension
        spec = importlib.machinery.ModuleSpec(
            name=name, loader=loader, origin=path)
        return _load(spec)
 
else:
    load_dynamic = None
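
Before patching imp.py, it can help to confirm which DLL directories are actually missing from PATH. The following is a minimal sketch and not part of the original fix: the helper name find_on_path and the example DLL names are assumptions chosen for illustration, so substitute the DLLs your own build links against.

```python
import os


def find_on_path(dll_name, path=None):
    """Return every directory on a PATH-style string that contains dll_name."""
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        # Skip empty entries such as the one produced by ";;".
        if directory and os.path.isfile(os.path.join(directory, dll_name)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    # Hypothetical examples: names of DLLs the extension module may depend on.
    for name in ("cudart64_100.dll", "icuuc62.dll"):
        found = find_on_path(name)
        print(name, "->", found if found else "NOT FOUND on PATH")
```

An empty result for a DLL suggests that its directory is a candidate for being added to PATH before the import is attempted.
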

3. Compression error when building the package

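The following command fails because the archive cannot be unzipped. The real cause is that the zip file produced at the end of the previous step is corrupt.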

bazel-bin\tensorflow\tools\pip_package\build_pip_package C:/tmp/tensorflow_pkg

The reason is that the zip tool bundled with Bazel cannot compress a single file larger than 4 GB.
File: \_bazel_Administrator\install\1aab8d4c0a7c3e1b65a21a1c51f808f0\_embedded_binaries\embedded_tools\third_party\ijar\zip.cc

bool OutputZipFile::Open() {
  if (estimated_size_ > kMaximumOutputSize) {
    fprintf(stderr,
            "Uncompressed input jar has size %zu, "
            "which exceeds the maximum supported output size %zu.\n"
            "Assuming that ijar will be smaller and hoping for the best.\n",
            estimated_size_, kMaximumOutputSize);
    estimated_size_ = kMaximumOutputSize;
  }
 
  MappedOutputFile* output_file = new MappedOutputFile(
      filename_, estimated_size_);
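
To check whether the inputs really exceed that limit, one can sum the sizes of the files listed in the .params file. This is a rough sketch under stated assumptions: it assumes each line has the form destination=source, it uses 4 GiB as the threshold based on the figure quoted above rather than on the actual value of kMaximumOutputSize, and the function name total_input_size is hypothetical.

```python
import os

FOUR_GIB = 4 * 1024 ** 3  # assumed threshold, taken from the 4G figure above


def total_input_size(params_file, src_root):
    """Sum the sizes of all source files named in a dst=src params file."""
    total = 0
    with open(params_file, encoding="utf-8") as fh:
        for line in fh:
            parts = line.rstrip("\n").split("=", 1)
            # Lines without a source part describe empty entries.
            if len(parts) < 2 or not parts[1]:
                continue
            src = os.path.join(src_root, parts[1])
            if os.path.isfile(src):
                total += os.path.getsize(src)
    return total
```

Comparing the returned value with FOUR_GIB gives a quick indication of whether the archive is likely to hit the size limit.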

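Workaround:
Write a small program that copies the files listed in the .params file into a directory, then zip that directory with an ordinary archiver and replace the broken zip file with the result.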

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;

public class CopySimpleConsoleForWindowsFiles {

	public static void main(String[] args) throws Exception {
		// Each line of the .params file is read as <path inside the zip>=<source path>.
		String params = "G:\\tensorflow-master\\bazel-bin\\tensorflow\\tools\\pip_package\\simple_console_for_windows.zip-0.params";
		// Directory that receives the copied file tree.
		String dir = "G:\\tensorflow-master\\bazel-bin\\tensorflow\\tools\\pip_package\\simple_console_for_windows\\";
		// Root against which the source paths are resolved.
		String srcRoot = "G:\\tensorflow-master\\bazel-tensorflow-master\\";

		try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(params)))) {
			String line;
			while ((line = br.readLine()) != null) {
				String[] files = line.split("=");
				if (files.length == 0 || files[0].isEmpty())
					continue;
				try {
					File dst = new File(dir + files[0]);
					if (!dst.getParentFile().exists()) {
						dst.getParentFile().mkdirs();
					}
					// An entry without a source path becomes an empty file.
					try (FileOutputStream dsto = new FileOutputStream(dst)) {
						if (files.length > 1) {
							try (FileInputStream src = new FileInputStream(srcRoot + files[1])) {
								byte[] bytes = new byte[64 * 1024];
								int count;
								while ((count = src.read(bytes)) != -1) {
									dsto.write(bytes, 0, count);
								}
							}
						}
					}
				} catch (Exception e) {
					// Report the failure and continue with the next entry.
					e.printStackTrace();
				}
			}
		}
	}
}

Then go to the simple_console_for_windows directory, select its files, and compress them into simple_console_for_windows.zip.
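
If no archiver is at hand, this final zipping step can also be scripted. The sketch below uses Python's standard zipfile module, whose ZIP64 extensions (allowZip64=True, the default since Python 3.4) permit archives larger than 4 GiB. Whether the resulting archive matches exactly what build_pip_package expects is an assumption that should be verified, and the function name zip_directory is illustrative only.

```python
import os
import zipfile


def zip_directory(src_dir, zip_path):
    """Write every file under src_dir into zip_path, using paths relative to src_dir."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED, allowZip64=True) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                full = os.path.join(root, name)
                # Store paths relative to src_dir so the archive has no extra top-level folder.
                zf.write(full, os.path.relpath(full, src_dir))
```

The output path should lie outside src_dir, otherwise the walk would pick up the archive while it is being written.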
This work is licensed under a Creative Commons Attribution 4.0 International License.
