cavis/libnd4j/include/helpers/StringUtils.h

/*******************************************************************************
* Copyright (c) 2015-2018 Skymind, Inc.
* Copyright (c) 2019-2020 Konduit K.K.
*
* This program and the accompanying materials are made available under the
* terms of the Apache License, Version 2.0 which is available at
* https://www.apache.org/licenses/LICENSE-2.0.
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations
* under the License.
*
* SPDX-License-Identifier: Apache-2.0
******************************************************************************/
//
// Created by raver119 on 20/04/18.
// @author Oleg Semeniv <oleg.semeniv@gmail.com>
//
#ifndef LIBND4J_STRINGUTILS_H
#define LIBND4J_STRINGUTILS_H
#include <system/pointercast.h>
#include <system/op_boilerplate.h>
#include <string>
#include <sstream>
#include <vector>
#include <array/NDArray.h>
#include <helpers/unicode.h>
namespace sd {
    class ND4J_EXPORT StringUtils {
    public:
        template <typename T>
        static FORCEINLINE std::string valueToString(T value) {
            std::ostringstream os;
            os << value;
            // convert the string stream into a string and return it
            return os.str();
        }
        /**
         * Converts an integer value to a string of 0s and 1s (its binary representation).
         * @param value integer value to convert
         * @return string made up of '0' and '1' characters
         */
        template <typename T>
        static std::string bitsToString(T value);
        /**
         * Appends the given graphId, in square brackets, to an error message.
         * @param message error message text
         * @param graphId id of the graph the message refers to
         * @return the message followed by " [graphId]"
         */
        static FORCEINLINE std::string buildGraphErrorMessage(const char *message, Nd4jLong graphId) {
            std::string result(message);
            result += " [";
            result += valueToString<Nd4jLong>(graphId);
            result += "]";
            return result;
        }
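
        // Usage sketch (illustrative only, not part of the original header; the
        // message text and the id below are made-up values):
        //   std::string s = StringUtils::valueToString<int>(42);
        //   // s == "42"
        //   std::string e = StringUtils::buildGraphErrorMessage("Graph execution failed", 119);
        //   // e == "Graph execution failed [119]"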
Signed-off-by: raver119 <raver119@gmail.com> * restore timeout back again Signed-off-by: raver119 <raver119@gmail.com> * smaller distribution for rng test to prevent timeouts Signed-off-by: raver119 <raver119@gmail.com> * CUDA DataBuffer::memcpy now copies to device all the time Signed-off-by: raver119 <raver119@gmail.com> * OpaqueDataBuffer now contains all required methods for interop Signed-off-by: raver119 <raver119@gmail.com> * some javadoc Signed-off-by: raver119 <raver119@gmail.com> * GC on failed allocations Signed-off-by: raver119 <raver119@gmail.com> * minoe memcpu tweak Signed-off-by: raver119 <raver119@gmail.com> * one more bitcast test Signed-off-by: raver119 <raver119@gmail.com> * - NDArray::deviceId() propagation - special multi-threaded test for data locality checks Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer additional syncStream Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer additional syncStream Signed-off-by: raver119 <raver119@gmail.com> * one ignored test Signed-off-by: raver119 <raver119@gmail.com> * skip host alloc for empty arrays Signed-off-by: raver119 <raver119@gmail.com> * ByteBuffer support is back Signed-off-by: raver119 <raver119@gmail.com> * DataBuffer::memcpy minor fix Signed-off-by: raver119 <raver119@gmail.com> * few minor prelu/bp tweaks Signed-off-by: raver119 <raver119@gmail.com> * nullify-related fixes Signed-off-by: raver119 <raver119@gmail.com> * PReLU fixes (#157) Signed-off-by: Alex Black <blacka101@gmail.com> * Build fixed * Fix tests * one more ByteBuffer signature restored Signed-off-by: raver119 <raver119@gmail.com> * nd4j-jdbc-hsql profiles fix Signed-off-by: raver119 <raver119@gmail.com> * nd4j-jdbc-hsql profiles fix Signed-off-by: raver119 <raver119@gmail.com> * PReLU weight init fix Signed-off-by: Alex Black <blacka101@gmail.com> * Small PReLU fix Signed-off-by: Alex Black <blacka101@gmail.com> * - INDArray.migrate() reactivated - DataBuffer::setDeviceId(...) 
added - InteropDataBuffer Z syncToDevice added for views Signed-off-by: raver119 <raver119@gmail.com> * missed file Signed-off-by: raver119 <raver119@gmail.com> * Small tweak Signed-off-by: Alex Black <blacka101@gmail.com> * cuda 10.2 Signed-off-by: raver119 <raver119@gmail.com> * minor fix Signed-off-by: raver119 <raver119@gmail.com> Co-authored-by: shugeo <sgazeos@gmail.com> Co-authored-by: Alex Black <blacka101@gmail.com> Co-authored-by: Alexander Stoyakin <alexander.stoyakin@gmail.com>
2020-01-04 11:27:50 +01:00
/**
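* This method returns the number of needle matches within haystack
* PLEASE NOTE: this method operates on 8-bit arrays interpreted as uint8
*
* @param haystack buffer to search in
* @param haystackLength length of haystack, in bytes
* @param needle buffer to search for
* @param needleLength length of needle, in bytes
* @return number of matches found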
*/
static uint64_t countSubarrays(const void *haystack, uint64_t haystackLength, const void *needle, uint64_t needleLength);
/**
* This method returns the number of bytes used by the content of a string NDArray
* PLEASE NOTE: this doesn't include the header
*
* @param array string NDArray to measure
* @return content length, in bytes
static uint64_t byteLength(const NDArray &array);
/**
* This method splits a string into substrings by delimiter
*
* @param haystack string to split
* @param delimiter delimiter to split on
* @return vector of substrings
*/
static std::vector<std::string> split(const std::string &haystack, const std::string &delimiter);
/**
* This method converts a u8 string to a u16 string
* @param u8 const reference to input string
* @param u16 reference to output u16string
* @return boolean status
*/
static bool u8StringToU16String(const std::string& u8, std::u16string& u16);
/**
* This method converts a u8 string to a u32 string
* @param u8 const reference to input string
* @param u32 reference to output u32string
* @return boolean status
*/
static bool u8StringToU32String(const std::string& u8, std::u32string& u32);
/**
* This method converts a u16 string to a u32 string
* @param u16 const reference to input u16string
* @param u32 reference to output u32string
* @return boolean status
*/
static bool u16StringToU32String(const std::u16string& u16, std::u32string& u32);
/**
* This method converts a u16 string to a u8 string
* @param u16 const reference to input u16string
* @param u8 reference to output string
* @return boolean status
*/
static bool u16StringToU8String(const std::u16string& u16, std::string& u8);
/**
* This method converts a u32 string to a u16 string
* @param u32 const reference to input u32string
* @param u16 reference to output u16string
* @return boolean status
*/
static bool u32StringToU16String(const std::u32string& u32, std::u16string& u16);
/**
* This method converts a u32 string to a u8 string
* @param u32 const reference to input u32string
* @param u8 reference to output string
* @return boolean status
*/
static bool u32StringToU8String(const std::u32string& u32, std::string& u8);
template <typename T>
static std::string vectorToString(const std::vector<T> &vec);
};
}
#endif //LIBND4J_STRINGUTILS_H
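
The conversion helpers declared above share one shape: a const reference to the input string, a reference to the output string, and a boolean status. The header does not show how they are implemented, so the following is only a minimal standalone sketch of that shape for the UTF-8 to UTF-16 case. The name `utf8ToUtf16Sketch` is hypothetical, and two things are assumptions rather than documented behaviour: that `true` means success, and how malformed input is handled.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in, NOT the libnd4j implementation.
// Decodes UTF-8 from `in` and appends UTF-16 code units to `out`.
// Returns false on truncated or malformed sequences (assumed convention);
// `out` may hold a partial result in that case.
// Overlong encodings are not rejected in this sketch.
bool utf8ToUtf16Sketch(const std::string& in, std::u16string& out) {
    out.clear();
    std::size_t i = 0;
    while (i < in.size()) {
        const std::uint8_t lead = static_cast<std::uint8_t>(in[i]);
        std::uint32_t cp = 0;    // decoded code point
        std::size_t extra = 0;   // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else return false;  // stray continuation byte or invalid lead byte

        if (i + extra >= in.size()) return false;  // truncated sequence

        for (std::size_t k = 1; k <= extra; ++k) {
            const std::uint8_t c = static_cast<std::uint8_t>(in[i + k]);
            if ((c & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        i += extra + 1;

        // Reject values outside Unicode and encoded surrogates.
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return false;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return true;
}
```

A caller would check the returned status before using the output, which is presumably also how `u8StringToU16String` is meant to be used.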