# [Programming Pearls] Column 1 – bitvec

Use case of a bitvec:
v = bitvec(capacity)
v.set(45322)
v.set(1000234)

if v.has(453222): print(‘ok it’s there’)

Data structure:
bytearray

Key concepts on storage:
- 1M bytes == 8M bits, meaning bitvec on 1MB memory will be able to handle 8M numbers, with the max number to be 8M – 1 (assuming we need to store 0 in the bitvec)
- 10M numbers requests 10/8M bytes – Assuming ‘M’ is always 2^20, not necessarily 1,000,000.

Key implementation problems:
In the case of v.set(num),
- What is the corresponding position (naming it as pos) for num in the underlying bytearray?
pos = num / 8 (or num >> 3)
- What is the corresponding mask index (naming it as r, remainder) in bytearray[pos]?
r = num % 8 (or num & 0×7)

The implementation of bit vector:

```class prog_1_2_bitvec:
"""
>>> v = prog_1_2_bitvec(6)
>>> v.set(2)
>>> v.set(5)
>>> print('%x' % v.bytes[0]) # should be 100100
24
>>> v.has(2)
True
>>> v.has(5)
True
>>> v.unset(2)
>>> v.has(2)
False
>>> v.unset(5)
>>> v.has(5)
False
>>> v.set(3)
>>> v.has(3)
True
"""

def __init__(self, max):
if max <= 0: raise Exception('invalid max - length <= 0')
self.bytes = bytearray(int(max / 8 + 1))
self.max = max

def set(self, num):
self._valid(num)

def has(self, num):
self._valid(num)

def unset(self, num):
self._valid(num)

def count(self):
# lie: TODO: best way to count the number of 1
# will cover it later
pass

def _divide(self, num):
d = num >> 3
r = num & 0x7
return (d, 1 << r)

def _valid(self, num):
if num < 0: raise Exception('invalid num - negative number')
if num >= self.max: raise Exception('invalid num - num > %d' % self.max)
```

The implementation of multi-pass bitvec sort:

```def prog_1_3_bitsort_multipass(l, max, k):
"""
TODO: margin conditions...

>>> max = 1 << 8
>>> l = prog_1_4_gen_list(max, max >> 1)
>>> print(l[0:20])  # doctest:+ELLIPSIS
[...]
>>> l = prog_1_3_bitsort_multipass(l, max, 2)
>>> print(l[0:20])  # doctest:+ELLIPSIS
[...]
>>> test_verify_sorted(l)
True
"""
def sort_singlepass(l, max, i, k):
# assumptions:
# 1. let slice = max / k, the func is to handle numbers from range i * slice ~ (i + 1) * slice
# 2. The mapping relationship is n => n - i * slice
# exception cases (needs assertion):
# 1. max / k has remainder
slice = int(max / k)
upper = (i + 1) * slice
lower = i * slice
v = prog_1_2_bitvec(slice)
for n in l:
if n < lower or n >= upper: continue
mapped_n = n - lower
v.set(mapped_n)
sortedlist = [n + i * slice for n in range(slice) if v.has(n)]
return sortedlist

sortedlist = []
for i in range(0, k):
sortedlist += sort_singlepass(l, max, i, k)
return sortedlist
```

Hints:
- Overall: define use case – UT, method signature, etc
- Write logic: write mainstraem logic with notes (in comments) or assertions for exception / corner cases.
- Organize mind: cases!!

Tests (TODO):
- Areas: UT – like UT for sort_singlepass, basic E2E (positive), negative like to pass numbers less than 0 or exceeding data volume, code coverage (previous test cases may have covered most), performance

# 安全，安全…2…强命名程序集

…这种做法，漏洞很大。在上一篇blog中提到过，利用消息签名防篡改有个重要的前提：“消息接收者都能拿到消息发送者真正的公钥，设想假如某些消息接收者拿到了假的公钥…那么拥有对应假私钥的篡改者就能任意篡改数据了”。.NET的做法，将公钥与需要防篡改的程序集一起发布，那么假如篡改者拿到这个程序集，改掉部分代码，然后用自己的私钥生成签名，和自己的公钥一起替换掉原来程序集文件里的签名和公钥部分。那么CLR是检测不出来的。更简单一点，篡改者改掉代码，然后直接抹去原来程序集文件里的签名和公钥即可。

# 安全，安全…1…公钥私钥

### 杂项&参考

RSA算法：一种公开密钥加密算法

SHA算法：一种不可逆的生成报文摘要的哈希算法。

MD5算法：一种不可逆的生成报文摘要的哈希算法。