在Oracle中计算设置位的最佳方法

xa9qqrwz  于 2023-10-16  发布在  Oracle
关注(0)|答案(3)|浏览(121)

为了为Oracle SQL中的一个非常特殊的用例节省保存空间,我正在尝试使用位图方法来表示一段时间内的布尔状态(当天发生或未发生的事件),二进制数中的每个位表示当天的是/否。这样,例如,一个32位的数字将允许我表示连续32天每天是否发生了一些事情。我只需要计算设置(=1)的位数,就可以得到事件发生的32天内的天数,而不必为每一天存储单独的日期行。
下面是我每天测试更新值的一个例子(滚动最旧的位并设置最新的位):

SELECT testbitmap original_bitmap,
       BITAND((testbitmap * POWER(2,/*rolloff the nbr of days since last event*/1)) + /*the event happened today*/1,POWER(2,32)-1) new_bitmap
  FROM (SELECT BIN_TO_NUM(1,1,0,0,1,1,0,0,0,1,1,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0) testbitmap
          FROM dual)

到目前为止还不错。但是现在我需要查询结果,这意味着计算结果位图的设置位数。据我所知,在Oracle中没有BIN_TO_NUM的逆。如果不使用循环遍历每个位并单独测试的函数,是否有方法在Oracle中计算设置位(例如,数字9的结果应该是2(1001),而数字7的结果应该是3(0111)。也许是一个数学公式,它会返回表示二进制数字所需的1的个数?

brc7rcf0

brc7rcf01#

你可以使用Hamming Weight algorithm,在C++中它是:

int numberOfSetBits(uint32_t i)
{
     // Java: use int, and use >>> instead of >>. Or use Integer.bitCount()
     // C or C++: use uint32_t
     i = i - ((i >> 1) & 0x55555555);        // add pairs of bits
     i = (i & 0x33333333) + ((i >> 2) & 0x33333333);  // quads
     i = (i + (i >> 4)) & 0x0F0F0F0F;        // groups of 8
     return (i * 0x01010101) >> 24;          // horizontal sum of bytes
}

在Oracle中,您可以使用以下函数:

CREATE FUNCTION number_of_set_bits32(
  i NUMBER
) RETURN NUMBER DETERMINISTIC
IS
  v NUMBER := i;
  c_max CONSTANT NUMBER := POWER(2, 32);
BEGIN
  v := v - BITAND(TRUNC(v/2), TO_NUMBER('55555555', 'XXXXXXXX'));
  v := MOD(
         BITAND(v, TO_NUMBER('33333333', 'XXXXXXXX'))
         + BITAND(TRUNC(v/4), TO_NUMBER('33333333', 'XXXXXXXX')),
         c_max
       );
  v := BITAND(v + FLOOR(v/16), TO_NUMBER('0F0F0F0F', 'XXXXXXXX'));
  RETURN TRUNC(MOD(v * TO_NUMBER('01010101', 'XXXXXXXX'), c_max) / POWER(2, 24));
END;
/

那么计算你的价值将是:

SELECT testbitmap AS original_bitmap,
       NUMBER_OF_SET_BITS32(testbitmap)
FROM   (
  SELECT BIN_TO_NUM(1,1,0,0,1,1,0,0,0,1,1,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0)
           AS testbitmap
  FROM   DUAL
)

其输出:
| ORIGINAL_BITMAP| NUMBER_OF_SET_BITS32(测试位图)|
| --|--|
| 3429605376 | 11 |
你也应该能够使用Java创建一个Oracle函数:

CREATE OR REPLACE FUNCTION number_of_set_bits(i NUMBER) RETURN NUMBER DETERMINISTIC
  AS LANGUAGE JAVA NAME 'java.lang.Long.bitCount( long ) return long';
/

对于相同的SELECT查询,它给出相同的输出。

64位和128位数字:

如果你想要一个64位(或128位)的函数,那么将所有十六进制字符串的宽度加倍(四倍),并将最后的除法从224改为256(2120),以获得最高的8位,正如在这个答案顶部链接的答案中提到的那样:

CREATE FUNCTION number_of_set_bits64(
  i NUMBER
) RETURN NUMBER DETERMINISTIC
IS
  v NUMBER := i;
  c_max CONSTANT NUMBER := POWER(2, 64);
BEGIN
  v := v - BITAND(TRUNC(v/2), TO_NUMBER('5555555555555555', 'XXXXXXXXXXXXXXXX'));
  v := MOD(
         BITAND(v, TO_NUMBER('3333333333333333', 'XXXXXXXXXXXXXXXX'))
         + BITAND(TRUNC(v/4), TO_NUMBER('3333333333333333', 'XXXXXXXXXXXXXXXX')),
         c_max
       );
  v := BITAND(v + FLOOR(v/16), TO_NUMBER('0F0F0F0F0F0F0F0F', 'XXXXXXXXXXXXXXXX'));
  RETURN TRUNC(
    MOD(v * TO_NUMBER('0101010101010101', 'XXXXXXXXXXXXXXXX'), c_max)
    / POWER(2, 56)
  );
END;
/
CREATE FUNCTION number_of_set_bits128(
  i NUMBER
) RETURN NUMBER DETERMINISTIC
IS
  v NUMBER := i;
  c_max CONSTANT NUMBER := POWER(2, 128);
BEGIN
  v := v - BITAND(TRUNC(v/2), TO_NUMBER('55555555555555555555555555555555', 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'));
  v := MOD(
         BITAND(v, TO_NUMBER('33333333333333333333333333333333', 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'))
         + BITAND(TRUNC(v/4), TO_NUMBER('33333333333333333333333333333333', 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX')),
         c_max
       );
  v := BITAND(v + FLOOR(v/16), TO_NUMBER('0F0F0F0F0F0F0F0F0F0F0F0F0F0F0F0F', 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'));
  RETURN TRUNC(
    MOD(v * TO_NUMBER('01010101010101010101010101010101', 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX'), c_max)
    / POWER(2, 120)
  );
END;
/

或者,在Java中,对于更大的数字,可以使用java.math.BigInteger.bitCount()

CREATE AND COMPILE JAVA SOURCE NAMED bitutils AS
import java.math.BigDecimal;

public class BitUtils {
  public static Integer bitCount(
    final BigDecimal value
  )
  {
    return value == null ? null : value.toBigInteger().bitCount();
  }
};
/

CREATE OR REPLACE FUNCTION number_of_set_bits_java(i NUMBER) RETURN NUMBER DETERMINISTIC
  AS LANGUAGE JAVA NAME 'BitUtils.bitCount( java.math.BigDecimal ) return java.lang.Integer';
/

fiddle

cwdobuhd

cwdobuhd2#

我不像@MT0那么聪明,我会使用这种方法。首先将数字转换为二进制:

CREATE FUNCTION Dec2Base(DecN IN NUMBER, Base IN INTEGER DEFAULT 16) RETURN VARCHAR2 DETERMINISTIC IS
    
    n NUMBER;
    BaseString VARCHAR2(129) := NULL;
    DecNumber NUMBER := DecN;
    HexString CONSTANT CHAR(16) := '0123456789ABCDEF';
    
BEGIN
    IF DecN IS NULL THEN
        RETURN NULL;
    ELSIF Base > 16 THEN 
        RAISE VALUE_ERROR;
    ELSIF base = 16 THEN
        RETURN TO_CHAR(DecN, 'fmXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX');
    ELSE
        IF DecN > 2**127 THEN
            -- "MOD(DecNumber, Base)" returns wrong result for larger numbers
            RAISE NUMERIC_OVERFLOW;
        END IF;         
        LOOP
            BaseString := SUBSTR(HexString, MOD(DecNumber, Base) + 1, 1 ) || BaseString;
            DecNumber := TRUNC(DecNumber / Base);
            EXIT WHEN DecNumber = 0;
        END LOOP;
        RETURN BaseString;
    END IF;
    
END Dec2Base;

然后你可以用REGEXP_COUNT来计算1

REGEXP_COUNT(Dec2Base(BIN_TO_NUM(1,1,0,0,1,1,0,0,0,1,1,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0), 2), '1')

我认为汉明权重算法会更有效。您可以考虑将值存储为RAW,而不是整数值。UTL_RAW软件包提供了各种功能,应该可以满足您的需求。

dsf9zpds

dsf9zpds3#

这是另一个想法,从计数的数量设置位在一个32位整数这是很容易理解的。用PL/SQL写的,它很简单:

CREATE OR REPLACE FUNCTION f_count_setbits (in_number IN number)
    RETURN number DETERMINISTIC
  AS
    var_count pls_integer := 0;
    var_number number := in_number;
  BEGIN
    WHILE var_number > 0
    LOOP
      var_count := var_count + 1;
      var_number := BITAND(var_number, var_number - 1);
    END LOOP;
    
    RETURN var_count;
  END;

测试输出:

SELECT test_number,
       f_count_setbits(test_number) nbr_of_set_bits
  FROM (SELECT LEVEL test_number
          FROM dual
        CONNECT BY LEVEL < 16)

1   1
2   1
3   2
4   1
5   2
6   2
7   3
8   1
9   2
10  2
11  3
12  2
13  3
14  3
15  4

SELECT /* 32 bit */ f_count_setbits(BIN_TO_NUM(1,1,0,0,1,1,0,0,0,1,1,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0)) FROM dual

11

SELECT /* 127 bit */ f_count_setbits(BIN_TO_NUM(1,1,0,0,1,1,0,0,0,1,1,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,1,1,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,1,1,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,1,1,0,1,0,1,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0)) FROM dual

44

虽然这确实会循环,但它不会对每一位都循环...仅针对每个设置位(1)。因此,一个具有稀疏位(几个1)的数字将循环很少的次数。这使得它很适合用于跟踪稀疏事件。
然而,在大数字的实际实现中,特别是在MT 0增强了Hamming Weight算法以支持128位数字之后,我发现Hamming Weight方法仍然明显更快。

相关问题