raver119 6de00bf75f
[WIP] Weekly update of repo (#8390)
* [WIP] Fix compilation after nd4j changes (#37)

* Fix compilation.

* Some tests fixed

* Disable tests temporarily.

* Restored test

* Tests restored.

* Test restored.

* [WIP] perf tests (#40)

* special maxpool test

Signed-off-by: raver119 <raver119@gmail.com>

* Shyrma bnorm bp (#41)

Batchnorm backprop mkldnn

* Add SameDiff memory reuse memory manager (array cache) (#39)

* Attention op comments

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* ArrayCacheMemoryMgr - first pass

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Tweak array cache for use with SameDiff identity arrays

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* ArrayCacheMemoryMgr javadoc and properly get max memory

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* LRU cache policy + add tests

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Fixes

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Resize arrays internally if required for ArrayCacheMemoryMgr

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Test improvement

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* Small polish

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* SameDiff op runtime benchmarking listener (#42)

Signed-off-by: AlexDBlack <blacka101@gmail.com>

* INLINE_LOOPS for windows

Signed-off-by: raver119 <raver119@gmail.com>

* [WIP] ThreadPool (#8)

This PR removes OpenMP use in 95% of cases
2019-11-13 17:15:18 +03:00


/*******************************************************************************
* Copyright (c) 2015-2018 Skymind, Inc.
*
* This program and the accompanying materials are made available under the
* terms of the Apache License, Version 2.0 which is available at
* https://www.apache.org/licenses/LICENSE-2.0.
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
* WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
* License for the specific language governing permissions and limitations
* under the License.
*
* SPDX-License-Identifier: Apache-2.0
******************************************************************************/
//
// @author Yurii Shyrma (iuriish@yahoo.com)
//
#include "ResultSet.h"
#include <ops/declarable/helpers/matrixSetDiag.h>
#include <execution/Threads.h>
namespace nd4j {
namespace ops {
namespace helpers {
//////////////////////////////////////////////////////////////////////////
template<typename T>
void matrixSetDiag_(const NDArray& input, const NDArray& diagonal, NDArray& output, const bool zeroPad) {
    // input and output are the same array (x == z) when zeroPad = true
    // xRank = zRank, xRank = yRank + 1
    // xLen = zLen
    const T* x = input.bufferAsT<T>();
    const T* y = diagonal.bufferAsT<T>();
    T* z = output.bufferAsT<T>();
    const Nd4jLong* xShapeInfo = input.getShapeInfo();
    const Nd4jLong* yShapeInfo = diagonal.getShapeInfo();
    const Nd4jLong* zShapeInfo = output.getShapeInfo();
    const bool areSameOffsets = shape::haveSameShapeAndStrides(xShapeInfo, zShapeInfo); // shapes are definitely the same, but strides might not
    const int xRank = input.rankOf();
    const auto xLen = input.lengthOf();
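    // each worker receives its own [start, stop) sub-range and step from parallel_for;
    // iterate over that sub-range only, so threads do not all repeat the full 0..xLen loop
    auto func = PRAGMA_THREADS_FOR {
        Nd4jLong coords[MAX_RANK];
        for (Nd4jLong i = start; i < stop; i += increment) {
            shape::index2coords(i, xShapeInfo, coords);
            const auto xOffset = shape::getOffset(xShapeInfo, coords);
            const auto zOffset = areSameOffsets ? xOffset : shape::getOffset(zShapeInfo, coords);
            // condition to be on diagonal of innermost matrix;
            // y has rank xRank - 1, so only the leading yRank coordinates are used for its offset
            if (coords[xRank - 2] == coords[xRank - 1])
                z[zOffset] = y[shape::getOffset(yShapeInfo, coords)];
            else
                z[zOffset] = zeroPad ? static_cast<T>(0) : x[xOffset];
        }
    };
    samediff::Threads::parallel_for(func, 0, xLen);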
}
//////////////////////////////////////////////////////////////////////////
void matrixSetDiag(nd4j::LaunchContext* context, const NDArray& input, const NDArray& diagonal, NDArray& output, const bool zeroPad) {
    BUILD_SINGLE_SELECTOR(input.dataType(), matrixSetDiag_, (input, diagonal, output, zeroPad), LIBND4J_TYPES);
}
}
}
}
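
The helper above can be hard to follow without the libnd4j shape machinery, so here is a minimal standalone sketch of the same per-element rule. It is an illustration under stated assumptions, not the library's implementation: the names `setDiagRange` and `setDiag` are hypothetical, the arrays are assumed to be contiguous row-major buffers of shape [batch, rows, cols], and the diagonal is assumed to hold min(rows, cols) values per matrix. The `start`/`stop`/`increment` parameters mimic how a chunked parallel loop hands each worker a sub-range; here the two chunks simply run one after the other.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the helper: applies the diagonal rule to the linear
// indices in [start, stop), stepping by `increment`. Buffers are assumed to be
// contiguous and row-major with shape [batch, rows, cols].
template <typename T>
void setDiagRange(const std::vector<T>& x, const std::vector<T>& diag, std::vector<T>& z,
                  int64_t rows, int64_t cols, bool zeroPad,
                  int64_t start, int64_t stop, int64_t increment) {
    const int64_t matrixLen = rows * cols;
    const int64_t diagLen = rows < cols ? rows : cols;  // assumption: one value per diagonal cell
    for (int64_t i = start; i < stop; i += increment) {
        const int64_t b = i / matrixLen;           // which matrix in the batch
        const int64_t r = (i % matrixLen) / cols;  // row inside that matrix
        const int64_t c = i % cols;                // column inside that matrix
        if (r == c)
            z[i] = diag[b * diagLen + r];                 // on the diagonal: take the new value
        else
            z[i] = zeroPad ? static_cast<T>(0) : x[i];    // off the diagonal: zero or copy input
    }
}

// Processes the whole array as two chunks, the way two workers might split it.
template <typename T>
std::vector<T> setDiag(const std::vector<T>& x, const std::vector<T>& diag,
                       int64_t batch, int64_t rows, int64_t cols, bool zeroPad) {
    const int64_t len = batch * rows * cols;
    assert(x.size() == static_cast<std::size_t>(len));
    std::vector<T> z(x.size());
    const int64_t mid = len / 2;
    setDiagRange(x, diag, z, rows, cols, zeroPad, 0, mid, 1);    // first chunk
    setDiagRange(x, diag, z, rows, cols, zeroPad, mid, len, 1);  // second chunk
    return z;
}
```

Because every output element depends only on its own index, the chunks never overlap and can be processed in any order, which is why restricting each worker to its own sub-range is enough to make the loop safe to parallelize.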