首页 > 技术文章 > MiniSQL设计 - Index Manager模块

BirdCage 2017-07-16 16:03 原文

    依然是,按照模板完成的一块,而且可能完成度不太高。

    在这个过程中虽然自己有调试树,但是最后验收的时候数据量比较小,而buffer和record的结合完成的比较晚,所以测试仅仅只限于小部分的输入。写树的部分的时候感觉应该用C++写的……测试过程中如果能用C++来完成初始化会好很多……

    发现二分查找很难(要不是一开始写了我就遍历了……)相当于写了三种不同的二分查找,找具体位置、找空格、找前包含的位置(查找路过中间节点),然后就……一直在调整这几个二分查找。

    还有就是自己(好像总感觉从cBlock里调一个值蛮快的没必要写函数balabala),在一个block里取具体值的那个公式(value.Mid(8+ (3 + type_size)* n, type_size)之类)复制黏贴了无数次……感觉自己缺少相应的规划,还有一个函数体结构一大自己的变量名就认不出来了……感觉都长得差不多……value、block、value_temp1,诸如此类……以后还是要养成好的习惯!~

    最后感慨一下这个存储方式的神奇orz我之前只能用指针去表示子节点……

然后,因为最后验收的时候没有压测所以在测试的时候只对比较简单的情况进行了处理……可能这棵树还有很多问题……


一.模块分析:

      Index manager是程序的索引部分,直接对buffer manager提供的内存索引块操作,主要是构建B+树,负责实现record manager这一模块(提供函数接口)所需要的函数。功能有:存储\删除记录的索引,查找记录在table表中的相对位置等(由buffer负责写回磁盘)。

二.设计思路:

      块的设计:把4kb的block作为B+树的节点,在每个block的开始用一个不同字符代替叶子块和中间块,同时紧跟block中的value的数目以及value值,同时对索引文件中的空块链成单链表,由索引文件信息结构体的freenum保存第一个空块的块号,其余每块头三个字节保存下一个空块的块号,最后一个保存字符’000’。详细图如下:

中间节点:

开始为?标记为叶子节点。

_ttoi(node_value.Mid(1, 4))为有几个节点

value.Mid(8 + (3 + type_size)* n, type_size)为第n个索引值

value.Mid(5 + (3 + type_size)* n, 3))为第n个叶子节点的编号

叶子节点:

开始为!标记为叶子节点。

_ttoi(node_value.Mid(1, 4))为有几个数据

value.Mid(10 + (5 + type_size)* n, type_size)为第n个数值

value.Mid(10 + (5 + type_size)* n, type_size)为第n个数值的offset.

最后三位为下一个叶子节点。如果是#表示该节点已为最后一个叶子节点。

 

空节点的存储方式同buffer。

三、函数介绍:

实现的所有接口函数如下:

1.查找一个值等于inform .value(inform为info结构体)的记录在table表中的记录号:

int search_one(CString database, CStringtable_name, struct index_info & inform);

参数说明:

database:数据库名,cstring型;

table_name:表格文件名,cstring型

inform:info结构体,是引用。

函数结束返回时,记录号保存在inform .offset中,若没有,inform.offset=0。

返回值:

   -7    inform .offset=0    索引文件为空

      -8    inform .offset=0    表中无记录

           -1    inform .offset=0    所查值的类型错误(int,float,char(n))

           -3    inform .offset=0    读到异常块

           正数  叶子块号,若inform .offset>0,记录存在表文件中,记录号为inform .offset

正数  叶子块号,若inform .offset=0,记录不在表文件中,返回的叶子块号在函数insert_one中要用(因为将值插入改块就够了)

思路:按照b+树的原理,遍历中间和叶子节点,直到找到其值,返回。

 

2.查找一批>,  >=, <, <=inform . value(inform为info结构体)的记录在table表中的记录号:

void search_many(CString database,CStringtable_name,int& start,int& end,int type,

                           structindex_info& inform) ;

参数说明:

database:数据库名,cstring型;

table_name:表格文件名,cstring型

inform:info结构体,是引用,

type: 1 是 找出所有 > inform .value; 的记录;

2 是 找出所有 >= inform .value 的记录;

           3 是 找出所有 < inform .value 的记录;

          4 是 找出所有 <= inform .value 的记录;

 

函数返回时:

若没有:       start=0;

否则:       start=索引文件中所需记录所在的叶子节点的起始块号

             end=索引文件中所需记录所在的叶子节点的终止块号

             inform . offset =索引文件start起始块中的第一个所需值的偏移量(>,>=)  

或则

                  =索引文件end 终止块中的最后一个所需值的偏移量(<,<=)

思路:分两类考虑,按照b+树的原理,遍历中间和叶子节点,直到找到起始值或终止值所在的块时,返回。

 

3.插入一个值的函数,值和类型在参数inform结构体中:

3.1.voidinsert_one(CString database, CString  table_name, struct index_info & inform);

先调用search_one函数,找到要插入的叶子块块号,并判断是否值已经存在,若不存在,当叶子未满时,则找到位子插入,否则,调用insert_divide函数分割,再插入。

 

3.2.到叶子块满时调用函数:

void insert_divide(CString database,CStringtable_name,struct index_info& inform,

                           int leaf1,int leaf2,char* leafpoint2);

在这里为了简化后续的插入操作,在插入一个数的时候,判断插入完成后是否可以再插入一个值。如果插入该节点后该块被填满,调用该函数分割。

重复调用这个函数直到将leafpoint2插入之后不满足分裂的条件。

在找到leaf1的父节点为根节点的时候需要特殊处理,将新申请的块和根节点的编号交换(要保证根节点的块编号是001,因为是通过这个方法寻找根节点的。)

 

参数说明:

leaf1:原来满的叶子块的块号,此块已经分割好

leaf2:新的块号,此块保存有原来满的叶子块leaf1的右半部分

leafpoint2:传递字符串,在分裂后两个节点,后一个子块的向上传递的字符。

 

4.CStringint_to_str(int value);

int 型数值转化成5字节的string类型。

 

5.intfind_prev_leaf_sibling(CString database,CString table_name, struct index_infoinform,

int nodenum);

找出所在叶子块的前一个兄弟叶子块,调用find_left_child函数,找出最左孩子,并判断是否与nodenum相等,不相等的话,遍历所有孩子,找到nodenum块时,返回他的前面节点,否则,返回0。

返回值:

 -1      函数中读到错误块;

0       没有前一个兄弟叶子块(即此块为最左孩子)

>0      前一个兄弟叶子块的块号

 

6.intfind_next_leaf_sibling(CString database,CString table_name,struct index_infoinform,

 int nodenum);

找出所在叶子块的后一个兄弟叶子块,读取nodenum对应叶子块块号,根据值(最后三位)返回叶子块。

返回值:

 -1      函数中读到错误块;

0       没有后一个兄弟叶子块(即此块为最右孩子)

>0      后一个兄弟叶子块的块号

 

7.intfind_left_child (CString database,index_info inform);

找出最左叶子块的块号

返回值:

 -1      函数中读到错误块;

0       索引文件空

>0      最左叶子块的块号

 

8.intfind_right_child(CString database,index_info inform);

找出最右叶子块的块号。

返回值:

 -1      函数中读到错误块;

0       索引文件空

>0      最右叶子块的块号

 

9.intget_new_freeblocknum(CString database,CString table_name,struct index_info&inform);

调用buffer的函数findBlock,返回块号。

 

10.intfind_father(CString database,CString table_name,index_info inform, int num);

根据B+树的原理,查找值为inform .value所在的叶子节点,若中途碰到块号为num的数据块,则返回他的父亲节点。

返回值    -1         函数中读到错误块;

0         没有父亲块(即此块为根)

>0         父亲块的块号

 

11.void delete_one(CString database,CStringtable_name,struct index_info & inform);

调用search_one找到inform对应的位置,把对应的offset值置为0.

 

12. int compare(CString x, CString y, structindex_info& inform);

根据inform.type对两个参数转换之后,对两个参数进行比较。

返回值:x>y返回1,x==y返回0,x<y返回-1。

 

13. int find_pos(CString value, CString des_value, inttype_size, index_info  inform);

二分查找,主要应用于查找。

对于叶节点,返回该叶节点的位置,不存在返回-1。

对于中间节点,返回该节点所在区间的位置。

 

14. int find_insert(CString value, CString des_value,int type_size, index_info  inform);

二分查找,返回合适的插入位置,主要是对于叶节点

返回值:       所在区间 1~n

异常返回 0

如果已经存在表中存在 -pos -1,所在区间-n-1


四.所用到的结构体的说明:

//文件信息

struct fileInfo

{

    int type;                              // 0-> datafile

                                              //1 -> index file

    CStringfileName;               // the name of thefile

    intrecordAmount;                     // thenumber of record in the file

int freeNum;              // the free block number whichcould be used for

the file

    intrecordLength;               // the lengthof the record in the file

    fileInfo*next;                            // thepointer points to the next file

    blockInfo*firstBlock;          // point to thefirst block within the file

};

// 所读取块的信息

struct blockInfo

{

    intblockNum;                     // the blocknumber of the block, which indicate

it when it be newed

    booldirtyBit;                       // 0 ->flase

                                              //1 -> indicate dirty, write back

    blockInfo*next;                  // the pointerpoint to next block

    fileInfo*file;                       // thepointer point to the file, which the block belongs to

    intcharNum;                      // thenumber of chars in the block

    char*cBlock;                      // the arrayspace for storing the records in the block in buffer

    int iTime;                            // it indicate theage of the block in use

    int lock;                              // prevent theblock from replacing

};

//定义表的索引节点信息

struct index_info

{

    CStringindex_name;          //the name of the index file

    intlength;                      //the length of the value

    char type;                     //the type of the value

                               //0---int,1---float,2----char(n)   

    longoffset;                    //the record offset in the table file

    CStringvalue;                 //the value

};

五、Index Manager主要功能

1.创建索引:现在由recordmanager调用insert_one函数实现,而当创建空索引文件时刚开始建立索引的时候需要创建一个!0000#属性的节点

2.删除索引值:根据需要,当一个记录被删除时,其相应的索引值也要删除,调用delete_one函数。

3.插入索引值:当一条记录插入时,在索引文件中也应插入其值,调用insert_one函数。

4.查找某一记录的位置:当要查找某一记录的记录偏移量时,调用search_one函数实现。

5.查找某一批记录的位置(<,<=,>,>=):当要查找某一批记录的起始记录偏移量时,调用search_many函数实现。这次返回的offset是在block中的偏移量,需要再多次调用getoffset函数来依次返回偏移量。

6.在这里通过保证第一块存放的是根节点来维持这棵树。

六、测试

1、重复插入数据

2、往空树随机插入数据

3、往多层树中插入记录

4、测试删除单个数据

5、测试单个数据搜索

6、测试范围搜索

7、测试树的分裂

七、参考代码

//////////////////////////////////////////////////////////
///----------------------------------------------------///
///       Module: Index_Manager.cpp	
///       Produced by: Birdy & C												
///       Description: produce & maintain index 
///       date:2017/5/22                      
///----------------------------------------------------///
///////////////////////////////////////////////////////////
#include"MiniSQL.h"
#include"API_Module.h"
#include"Buffer_Manager.h"
#include"Catalog_Manager.h"
#include"Record_Manager.h"
#include"Index_Manager.h"
#include"Interpreter.h"

using namespace std;


//将float型转CString
CString float_to_CString_ten(float value_float) {
	char cb[9];

	//int *temp;
	//float enscript
	int * temp2 = (int*)&value_float;
	int temp = *temp2;
	for (int i = 0; i < 8; i++)
	{
		cb[i] = (temp & 0xf) + '0';
		temp = temp >> 4;
		//printf("%c", cb[i]);
	}
	cb[8] = 0;
	CString value(cb);
	return value;
}

//将CString型转float
float CString_to_float_ten(CString value_cstring) {
	float fb;
	char *temp1 = (char*)& fb;
	char *cb = (LPSTR)(LPCTSTR)value_cstring;
	for (int i = 0; i < 4; i++)
	{
		temp1[i] = (cb[2 * i + 1] - '0') << 4;
		temp1[i] |= (cb[2 * i] - '0');
	}
	return fb;
}


CString float_to_CString_ten_new(float value_float) {
	int b = int(value_float);
	CString rem;
	int temp;
	int whole = b;
	for (int i = 0; i < 8; i++)
	{
		temp = whole % 10;
		whole /= 10;
		rem.Insert(0, char(temp + '0'));
	}
	return rem;
}

float CString_to_float_ten_new(CString value_cstring) {
	int num = _ttoi(value_cstring.Mid(0, 8));
	float a = float(num);
	return a;
}


/*
1.查找一个值等于inform.value(inform为info结构体)的记录在table表中的记录号:
函数结束返回时,记录号保存在inform.offset中,若没有,inform.offset=0。
返回值:
- 7    inform.offset = 0    索引文件为空
- 8    inform.offset = 0    表中无记录
- 1    inform.offset = 0    所查值的类型错误(int,float,char(n))
- 3    inform.offset = 0    读到异常块
正数  叶子块号, 若inform.offset > 0,记录存在表文件中,记录号为inform.offset
正数  叶子块号 ,若inform.offset=0,记录不在表文件中,返回的叶子块号在函数insert_one中要用(因为将值插入改块就够了)
*/
int search_one(CString database, CString table_name, struct index_info &inform)
{

	inform.offset = 0;
	int type_kind = inform.type;
	int type_size = inform.length;
	CString des_value = inform.value;
	blockInfo *Block_temp = get_file_block(database, table_name, 1, 1);		//type kind = 1 get from 1

																			//check if 所查值的类型错误(int,float,char(n))
	if (type_kind > 2 || type_kind < 0)
	{
		return -1;
	}
	else if (type_kind == 0 && type_size != 4)
	{
		type_size = 4;
		cout << "search_one::type int should have length of 4 bit" << endl;
	}
	else if (type_kind == 1 && type_size != 8)
	{
		type_size = 8;
		cout << "search_one::type int should have length of 8 bit" << endl;
	}


	if (NULL == Block_temp)
	{
		return -7;
	}

	CString node_value = Block_temp->cBlock;

	while (1)
	{
		if ('!' == node_value.GetAt(0))			// leaf node
		{
			int pos = find_pos(node_value, des_value, type_size, inform);
			if (pos >= 0)
			{
				inform.offset = _ttoi(node_value.Mid(5 + (5 + type_size)* pos, 5));//find the value
				if (0 == inform.offset)
				{
					cout << "索引文件为空" << endl;
				}
			}
			return Block_temp->blockNum;
		}
		else if ('?' == node_value.GetAt(0))			// normal node
		{
			int pos = find_pos(node_value, des_value, type_size, inform);
			blockInfo *ptr_temp = get_file_block(database, table_name, 1,
				_ttoi(node_value.Mid(5 + (3 + type_size)* pos, 3)));
			node_value = ptr_temp->cBlock;
			Block_temp = ptr_temp;
		}
		else
		{
			// 读到异常块
			cout << "search_one::read abnormal block" << endl;
			return -3;
		}
	}
	//读到异常块 on normal case will not reach there?
	cout << "function search_one has undefined state" << endl;
	return -3;
}

//2.查找一批>, >= , <, <= inform.value(inform为info结构体)的记录在table表中的记录号:
/*
type: 3 是 找出所有 < inform.value 的记录;
4 是 找出所有 <= inform.value 的记录;
1 是 找出所有 > inform.value; 的记录;
2 是 找出所有 >= inform.value 的记录;
函数返回时:
若没有:start=0;
否则: start=索引文件中所需记录所在的叶子节点的起始块号
end=索引文件中所需记录所在的叶子节点的终止块号
inform.offset =索引文件start起始块中的第一个所需值的偏移量(>, >= )
或则
=索引文件end 终止块中的最后一个所需值的偏移量(<, <= )
思路:分两类考虑,按照b+树的原理,遍历中间和叶子节点,直到找到起始值或终止值所在的块时,返回。

*/
void search_many(CString database, CString table_name, int& start, int& end, int type,
	struct index_info& inform)
{
	int block_num = search_one(database, table_name, inform);
	if (block_num < 0)
		return;
	if (inform.offset > 0)
	{
		if (2 == type)
		{
			end = find_right_child(database, table_name, inform);
			start = block_num;
			return;
		}
		else if (4 == type)
		{
			start = find_left_child(database, table_name, inform);
			end = block_num;
			return;
		}
	}

	blockInfo *ptr_temp = get_file_block(database, table_name, 1, block_num);
	CString temp = ptr_temp->cBlock;
	int num = _ttoi(temp.Mid(1, 4));
	int offset = find_insert(temp, inform.value, inform.length, inform);
	if (1 == type || 2 == type)
	{
		if (offset > 0)
		{
			offset = offset - 1;
		}
		else if (offset < 0)
		{
			offset = -offset;
		}
		start = block_num;
		end = find_right_child(database, table_name, inform);
		if (num <= offset)
		{
			start = find_next_leaf_sibling(database, table_name, inform, start);
			if (0 == start)
				return;
			ptr_temp = get_file_block(database, table_name, 1, start);
			temp = ptr_temp->cBlock;
			offset = 0;
		}
		inform.offset = _ttoi(temp.Mid(5 + (5 + inform.length)* offset, 5));//find the value
		return;
	}
	if (3 == type || 4 == type)
	{
		if (offset > 0)
		{
			offset = offset - 1;
		}
		else if (offset < 0)
		{
			offset = -offset - 1;
		}

		end = block_num;
		start = find_left_child(database, table_name, inform);
		if (offset == 0)
		{
			end = find_prev_leaf_sibling(database, table_name, inform, end);
			if (0 == end)
			{
				start = 0;
				return;
			}
			ptr_temp = get_file_block(database, table_name, 1, end);
			temp = ptr_temp->cBlock;
			offset = _ttoi(temp.Mid(1, 4));
		}
		offset = offset - 1;
		inform.offset = _ttoi(temp.Mid(5 + (5 + inform.length)* offset, 5));//find the value
		return;
	}
	cout << "search_many::wrong type" << endl;
	return;
}



int compare(CString x, CString y, struct index_info& inform)
{
	if (inform.type == 0)//int
	{
		int i = ASCII_to_int((LPSTR)(LPCTSTR)x);
		int j = ASCII_to_int((LPSTR)(LPCTSTR)y);
		if (i > j)  return 1;
		else if (i == j)  return 0;
		else return -1;

	}
	else if (inform.type == 1)//float
	{
		int i = ASCII_to_float((LPSTR)(LPCTSTR)x);
		int j = ASCII_to_float((LPSTR)(LPCTSTR)y);
		if (i > j)  return 1;
		else if (i == j)  return 0;
		else return -1;
	}
	else //char
	{
		if (x > y)  return 1;
		else if (x == y)  return 0;
		else return -1;
	}
}
/*
3.插入一个值的函数,值和类型在参数inform结构体中:
参数说明:
database:数据库名,cstring型;
table_name:表格文件名,cstring型
inform:info结构体,是引用。
思路:先调用search_one函数,找到要插入的叶子块块号,并判断是否值已经存在,若不存在,当叶子未满时,则找到位子插入,否则,调用insert_divide函数分割,再插入。
*/

int insert_one(CString database, CString table_name, struct index_info & inform)
{
	int offset_temp = inform.offset;
	int leaf_node = search_one(database, table_name, inform);
	int type_size = inform.length;
	CString des_value = inform.value;
	blockInfo *pleaf = get_file_block(database, table_name, 1, leaf_node);//the leaf node that insert into
	offset_temp = pleaf->fileName->recordAmount;
	CString value = pleaf->cBlock;

	int pos = find_insert(value, des_value, type_size, inform);

	//if (pos == 0)
	//{
	//}
	if (pos < 0)
	{
		pos = -pos - 1;
		int m = 5 + (5 + type_size)*pos;
		//value.Mid(m, 5);
		if ("00000" == value.Mid(m, 5))
		{
			value.Delete(m, 5);
			value.Insert(m, 5);
			pleaf->cBlock = new char[value.GetLength()];
			char *m = (LPSTR)(LPCTSTR)value;
			strcpy(pleaf->cBlock, m);

		}
		else
		{
			cout << "insert_one::already exit in the file" << endl;
			return -1;
		}
		
	}
	pos--;
	CString record = int_to_str(offset_temp) + des_value;
	value.Insert(5 + (5 + type_size)* pos, record);

	//change the total number of record
	int num = _ttoi(value.Mid(1, 4));//the whole number of record
	num++;

	//if (0)
	if (value.GetLength() + type_size + 10 < BLOCK_SIZE)
	{
		CString num2 = int_to_str(num).Right(4);
		char *temp = pleaf->cBlock;
		value.Delete(1, 4);
		value.Insert(1, num2);

		pleaf->cBlock = new char[value.GetLength()];
		char *m = (LPSTR)(LPCTSTR)value;
		strcpy(pleaf->cBlock, m);
		pleaf->fileName->recordAmount++;

	}
	else
	{
		cout << "index::insert_one:????" << endl;

		int new_leaf_node = get_new_freeblocknum(database, table_name, inform);
		blockInfo *pleaf_new = get_file_block(database, table_name, inform.length, new_leaf_node);
		CString value1, value2, temp;
		int num1 = num / 2;

		value1 = value.Mid(5, num1*(5 + type_size));
		value2 = value.Mid(5 + num1*(5 + type_size), (num - num1) * (5 + type_size));
		temp = value2.Mid(5, type_size);																//get the first value in leaf2
																										//insert the first 5 characters 
		value1.Insert(0, int_to_str(num1).Right(4));
		value2.Insert(0, int_to_str(num - num1).Right(4));
		value1.Insert(0, '!');
		value2.Insert(0, '!');

		//insert the last 3 characters
		value1.Insert(value1.GetLength(), int_to_str(new_leaf_node).Right(3));

		if (value.Right(1) == "#")
			value2.Insert(value2.GetLength(), '#');
		else
			value2.Insert(value2.GetLength(), value.Right(3));

		char *m;
		//pleaf->cBlock = (LPTSTR)(LPCTSTR)value1;
		//pleaf_new->cBlock = (LPTSTR)(LPCTSTR)value2;
		pleaf->cBlock = new char[value1.GetLength()];
		m = (LPSTR)(LPCTSTR)value1;
		strcpy(pleaf->cBlock, m);

		pleaf_new->cBlock = new char[value2.GetLength()];
		m = (LPSTR)(LPCTSTR)value2;
		strcpy(pleaf_new->cBlock, m);


		insert_divide(database, table_name, inform, leaf_node, pleaf_new->blockNum, (LPTSTR)(LPCTSTR)temp);

	}
	return 1;
}

/*
2.到叶子块满时调用函数:
参数说明:
database:数据库名,cstring型;
table_name:表格文件名,cstring型
inform:info结构体,是引用。
leaf1:原来满的叶子块的块号,此块已经分割好
leaf2:新的块号,此块保存有原来满的叶子块leaf1的右半部分
leafpoint2:指向新块leaf2的指针

思路:当节点满时,由insert_one函数调用insert_divide函数,先找出所有要分割的节点的父节点,自上而下进行分割,然后由insert_divide函数再递归调用insert_one函数进行插入(这时肯定能插入)。
*/
void insert_divide(CString database, CString table_name, struct index_info& inform,
	int leaf1, int leaf2, char* leafpoint2)
{
	int type_size = inform.length;
	int block1 = leaf1;
	int block2 = leaf2;

	int father;
	CString temp;
	blockInfo *ptr_temp;
	CString value;
	temp = leafpoint2;


	father = find_father(database, table_name, inform, block1);
	ptr_temp = get_file_block(database, table_name, 1, father);
	value = ptr_temp->cBlock;
	int num = _ttoi(value.Mid(1, 4));										//the whole number of node


	if ('?' == value.GetAt(0))													// normal node
	{

		int pos = find_pos(value, temp, type_size, inform);

		num++;
		CString num2 = int_to_str(num).Right(4);
		value.Delete(1, 4);
		value.Insert(1, num2);

		value.Insert(8 + (3 + type_size)* pos, int_to_str(block2).Right(3));
		value.Insert(8 + (3 + type_size)* pos, temp);

		ptr_temp->cBlock = (LPTSTR)(LPCTSTR)value;
		ptr_temp->cBlock = new char[value.GetLength()];
		char * m = (LPSTR)(LPCTSTR)value;
		strcpy(ptr_temp->cBlock, m);
		num++;
		//if(0)
		if (value.GetLength() + type_size + 10 < BLOCK_SIZE)			// if the node is not full
		{
			return;
		}
		else
		{							// if the node is full	

			int new_node = get_new_freeblocknum(database, table_name, inform);
			blockInfo *pleaf_new = get_file_block(database, table_name, inform.length, new_node);
			CString value1, value2;
			int num1 = num / 2;
			value1 = value.Mid(5, num1*(3 + type_size) - type_size);
			value2 = value.Mid(5 + num1*(3 + type_size), (num - num1) * (3 + type_size) - type_size);
			value1.Insert(0, int_to_str(num1 - 1).Right(4));
			value2.Insert(0, int_to_str(num - num1 - 1).Right(4));
			value1.Insert(0, '?');
			value2.Insert(0, '?');

			temp = value.Mid(5 + num1*(3 + type_size) - type_size, type_size);

			char *m;
			ptr_temp->cBlock = new char[value1.GetLength()];
			m = (LPSTR)(LPCTSTR)value1;
			strcpy(ptr_temp->cBlock, m);

			pleaf_new->cBlock = new char[value2.GetLength()];
			m = (LPSTR)(LPCTSTR)value2;
			strcpy(pleaf_new->cBlock, m);

			if (father == 1)
			{
				int head = get_new_freeblocknum(database, table_name, inform);
				blockInfo *head_node = get_file_block(database, table_name, inform.length, head);

				ptr_temp->blockNum = head;
				head_node->blockNum = 1;//确保头节点的标记为1
				CString value_head = "?0001";
				value_head.Insert(value_head.GetLength(), int_to_str(head).Right(3));
				value_head.Insert(value_head.GetLength(), temp);
				value_head.Insert(value_head.GetLength(), int_to_str(new_node).Right(3));
				head_node->cBlock = new char[value_head.GetLength()];
				m = (LPSTR)(LPCTSTR)value_head;
				strcpy(head_node->cBlock, m);
			}
			else
			{
				insert_divide(database, table_name, inform, father, new_node, (LPTSTR)(LPCTSTR)temp);
			}
		}
	}
	else
	{
		cout << "buffer::insert_divide::read unexpected block" << endl;
		return;
	}


}

//int 型数值转化成5字节的string类型 :
CString int_to_str(int value)
{
	CString rem;
	int temp;
	int whole = value;
	for (int i = 0; i < 5; i++)
	{
		temp = whole % 10;
		whole /= 10;
		rem.Insert(0, char(temp + '0'));
	}
	return rem;
}

/*
5.找出所在叶子块的前一个兄弟叶子块:
参数说明:
database:数据库名,cstring型;
table_name:表格文件名,cstring型
inform:info结构体,不是引用,
nodenum:所在叶子块块号
返回值:
- 1      函数中读到错误块;
0       没有前一个兄弟叶子块(即此块为最左孩子)
>0      前一个兄弟叶子块的块号
思路:调用find_left_child函数,找出最左孩子,并判断是否与nodenum相等,不相等的话,
遍历所有孩子,找到nodenum块时,返回他的前面节点,否则,返回0。
*/

int find_prev_leaf_sibling(CString database, CString table_name, struct index_info inform,
	int nodenum)
{
	int most_left_child = find_left_child(database, table_name, inform);
	if (most_left_child == nodenum)
	{
		return 0;
	}
	blockInfo *ptr_temp = get_file_block(database, table_name, 1, most_left_child);
	CString node_value = ptr_temp->cBlock;
	int blocknum_rem = ptr_temp->blockNum;

	while (node_value.Right(1) != "#")
	{
		ptr_temp = get_file_block(database, table_name, 1, _ttoi(node_value.Right(3)));
		node_value = ptr_temp->cBlock;
		if (ptr_temp->blockNum == nodenum)
			return blocknum_rem;
		blocknum_rem = ptr_temp->blockNum;
	}
	return -1;
}

/*
6.找出所在叶子块的后一个兄弟叶子块:
- 1      函数中读到错误块;
0       没有后一个兄弟叶子块(即此块为最右孩子)
>0      后一个兄弟叶子块的块号
思路:调用find_right_child函数,找出最右孩子,并判断是否与nodenum相等,不相等的话,遍历所有孩子,找到nodenum块时,返回他的后面节点,否则,返回0。
*/

int find_next_leaf_sibling(CString database, CString table_name, struct index_info inform,
	int nodenum)
{
	blockInfo *ptr_temp = get_file_block(database, table_name, inform.length, nodenum);
	CString node_value = ptr_temp->cBlock;
	if (node_value.Right(1) == "#")
	{
		return 0;
	}
	return  _ttoi(node_value.Right(3));
}

/*
7.找出最左叶子块的块号:
- 1      函数中读到错误块;
0       索引文件空
>0      最左叶子块的块号
*/
int find_left_child(CString database, CString table_name, index_info  inform)
{
	int type_kind = inform.type;
	int type_size = inform.length;
	blockInfo *Block_temp = get_file_block(database, table_name, 1, 1);
	CString node_value = Block_temp->cBlock;

	while (1)
	{
		if ('!' == node_value.GetAt(0))			// leaf node
		{
			return Block_temp->blockNum;
		}
		else if ('?' == node_value.GetAt(0))			// normal node
		{
			int num = _ttoi(node_value.Mid(1, 4));     //the whole number of node
			if (0 == num)
				return -1;
			blockInfo *ptr_temp = get_file_block(database, table_name, 1,
				_ttoi(node_value.Mid(5, 3)));
			node_value = ptr_temp->cBlock;
			Block_temp = ptr_temp;
		}
		else
		{
			// 读到异常块
			cout << "find_left_child::read unnormal block" << endl;
			return -1;
		}
	}

	//读到异常块 on normal case will not reach there
	cout << "function find_left_child has undefined state" << endl;
	return -1;
}

/*
8.找出最右叶子块的块号:
- 1      函数中读到错误块;
0       索引文件空
>0      最右叶子块的块号
*/

int find_right_child(CString database, CString table_name, index_info  inform)
{
	int type_kind = inform.type;
	int type_size = inform.length;
	blockInfo *Block_temp = get_file_block(database, table_name, 1, 1);
	CString node_value = Block_temp->cBlock;

	while (1)
	{
		if ('!' == node_value.GetAt(0))			// leaf node
		{
			return Block_temp->blockNum;
		}
		else if ('?' == node_value.GetAt(0))			// normal node
		{
			int num = _ttoi(node_value.Mid(1, 4));     //the whole number of node
			if (0 == num)
				return -1;

			blockInfo *ptr_temp = get_file_block(database, table_name, 1,
				_ttoi(node_value.Mid(5 + (3 + type_size)* num, 3)));
			node_value = ptr_temp->cBlock;
			Block_temp = ptr_temp;
		}
		else
		{
			// 读到异常块
			cout << "find_right_child::read unnormal block" << endl;
			return -1;
		}
	}
	//读到异常块 on normal case will not reach there?
	cout << "function find_right_child has undefined state" << endl;
	return -1;
}

/*
9.获取一个空块的块号,并使总块数增一
0      读到异常块,或者块数>256块
>0     空块号
思路:判断块数是否>256,若没有,判断空块链表是否为空,为空则得到(现有总块号数
+1)的空块,否则,读取空块链表首,修改链表;并使总块数+1。
*/

int get_new_freeblocknum(CString database, CString table_name, struct index_info& inform)
{
	blockInfo *temp = findBlock(database);
	fileInfo * ptrDatabase;
	ptrDatabase = get_file_info(database, table_name, 1);
	temp->blockNum = ++ptrDatabase->recordAmount;
	replace(ptrDatabase, temp);
	return temp->blockNum;
	/*
	if (ptrDatabase->recordAmount > 256)
	return 0;

	if (ptrDatabase->freeNum)
	{
	int freeNum = ptrDatabase->freeNum;
	blockInfo *ptr_temp = get_file_block(database, table_name, inform.length, freeNum);
	CString m = ptr_temp->cBlock;
	ptrDatabase->freeNum = _ttoi(m);
	return freeNum;
	}
	else
	{
	blockInfo *newBlock = (blockInfo *)malloc(sizeof(struct blockInfo));
	//initilize the Block for other part
	newBlock->blockNum = ++ptrDatabase->recordAmount;
	newBlock->charNum = 3;
	newBlock->dirtyBit = 0;
	newBlock->cBlock = "000";
	newBlock->next = ptrDatabase->firstBlock;
	ptrDatabase->firstBlock = newBlock;
	return newBlock->blockNum;
	}
	cout << "get_new_freeblocknum :: error in get_new_freeblocknum" << endl;
	return -1;
	*/
}

/*
10.找块num的父亲块的块号(在查找inform.value的路径上)
参数说明:
database:数据库名,cstring型;
table_name:表格文件名,cstring型
inform:info结构体,不是引用,
num:所在块的块号
返回值:
- 1      函数中读到错误块;
0       没有父亲块(即此块为根)
>0      父亲块的块号
思路:根据B+树的原理,查找值为inform.value所在的叶子节点,若中途碰到块号为
num的数据块,则返回他的父亲节点
*/

int find_father(CString database, CString table_name, index_info inform, int num)
{
	inform.offset = 0;
	int type_size = inform.length;
	CString des_value = inform.value;
	blockInfo *Block_temp = get_file_block(database, table_name, 1, 1);

	//索引文件为空
	if (NULL == Block_temp)
	{
		return -8;
	}
	int block_num = Block_temp->blockNum;

	CString node_value = Block_temp->cBlock;

	while (1)
	{
		if (num == Block_temp->blockNum)
			return block_num;
		else
			block_num = Block_temp->blockNum;


		if ('!' == node_value.GetAt(0))						// leaf node
		{
			int pos = find_pos(node_value, des_value, type_size, inform);
			if (pos < 0)
				return -1;
			else
			{
				return 0;
			}
		}
		else if ('?' == node_value.GetAt(0))				// normal node
		{
			int pos = find_pos(node_value, des_value, type_size, inform);
			blockInfo *ptr_temp = get_file_block(database, table_name, 1,
				_ttoi(node_value.Mid(5 + (3 + type_size)* pos, 3)));
			node_value = ptr_temp->cBlock;
			Block_temp = ptr_temp;
		}
		else
		{
			// 读到异常块
			cout << "find_father :: read abnormal block" << endl;
			return -1;
		}
	}
	//读到异常块 on normal case will not reach there?
	cout << "find_father :: function search_one has undefined state" << endl;
	return 0;


}

/*11.*/
void delete_one(CString database, CString table_name,
	struct index_info & inform)
{
	fileInfo * ptrDatabase;
	ptrDatabase = get_file_info(database, table_name, 1);
	int Block_num = search_one(database, table_name, inform);
	if (inform.offset <= 0)
	{
		cout << "delete_one::删除的数据不存在表中" << endl;
		return;
	}
	blockInfo *Block = get_file_block(database, table_name, inform.length, Block_num);
	CString value = Block->cBlock;
	int pos = find_pos(value, inform.value, inform.length, inform);
	pos = 5 + (5 + inform.length)* pos;
	value.Delete(pos, 5);
	value.Insert(pos, "00000");
	//(LPSTR)(LPCTSTR)num2;
	Block->cBlock = new char[value.GetLength()];
	char *m = (LPSTR)(LPCTSTR)value;
	strcpy(Block->cBlock, m);


}

/*
12.
查找 返回位置
不存在返回-1
*/
int find_pos(CString value, CString des_value, int type_size, index_info  inform)
{

	CString node_value = value;
	int num = _ttoi(node_value.Mid(1, 4));						//the whole number of record

	if ('!' == node_value.GetAt(0))							// leaf node
	{
		int left, right, middle;
		left = 0;
		right = num - 1;
		//the following loop is a Binary Search
		while (left <= right)
		{
			middle = (left + right) / 2;
			if (compare(node_value.Mid(10 + (5 + type_size)* middle, type_size), des_value, inform) > 0)
			{
				right = middle - 1;
			}
			else if (compare(node_value.Mid(10 + (5 + type_size)* middle, type_size), des_value, inform) < 0)
			{
				left = middle + 1;
			}
			else
			{
				return middle;
			}
		}
		return -1;
	}
	else if ('?' == node_value.GetAt(0))			// normal node
	{
		int left, right, middle;
		left = 0;
		right = num;
		//the following loop is a Binary Search
		while (left < right)
		{
			middle = (left + right) / 2;
			if (compare(node_value.Mid(8 + (3 + type_size)* middle, type_size), des_value, inform) > 0)
			{
				right = middle;
			}
			else if (compare(node_value.Mid(8 + (3 + type_size)* middle, type_size), des_value, inform) < 0)
			{
				left = middle + 1;
			}
			else
			{
				left = right = middle + 1;
				break;
			}
		}

		return left;
	}
	else
	{
		cout << " find_pos::读到异常块 " << endl;
		return -3;

	}
}

/*
13.
返回插入位置 所在区间 1~n
异常返回 0
对于叶子 如果已经存在表中存在 -pos -1
*/

int find_insert(CString value, CString des_value, int type_size, index_info  inform)
{
	CString node_value = value;

	int num = _ttoi(node_value.Mid(1, 4));//the whole number of record
	if (0 == num)
		return 1;
	int left, right, middle;
	if ('!' == node_value.GetAt(0))							// leaf node
	{
		left = 0;
		right = num;
		//the following loop is a Binary Search
		while (left < right)
		{
			middle = (left + right) / 2;
			if (compare(node_value.Mid(10 + (5 + type_size)* middle, type_size), des_value, inform) > 0)
			{
				right = middle;
			}
			else if (compare(node_value.Mid(10 + (5 + type_size)* middle, type_size), des_value, inform) < 0)
			{
				left = middle + 1;
			}
			else
			{
				cout << "insert_one::input already exit in the table" << endl;
				return -middle - 1;
			}
		}
		return left + 1;
	}
	else
	{
		cout << " find_pos::读到异常块 " << endl;
		return 0;
	}
}

//flag 1是正向
int getoffset(CString database, CString table_name, int &start, int &end, struct index_info & inform, int flag)
{
	int real_offset = -1;
	if (0 == start)
		return -1;

	if (1 == flag)
	{
		blockInfo *Block_temp = get_file_block(database, table_name, start, 1);		//type kind = 1 get from 1
		CString node_value(Block_temp->cBlock);
		real_offset = _ttoi(node_value.Mid(5 + (5 + inform.length)* inform.offset, 5));//find the value
		int num = _ttoi(node_value.Mid(1, 4));

		if (num > inform.offset + 1) //num 从1开始 offset 从0开始
		{
			inform.offset++;
		}
		else
		{
			int next_node = find_next_leaf_sibling(database, table_name, inform, start);
			if (next_node > 0)
			{
				start = next_node;
				inform.offset = 0;
			}
			else
			{
				start = 0;
			}
		}
	}
	else if (-1 == flag)
	{

		blockInfo *Block_temp = get_file_block(database, table_name, end, 1);		//type kind = 1 get from 1
		CString node_value(Block_temp->cBlock);
		real_offset = _ttoi(node_value.Mid(5 + (5 + inform.length)* inform.offset, 5));//find the value

		if (0 < inform.offset)//num 从1开始 offset 从0开始
		{
			inform.offset--;
		}
		else
		{

			int pre_node = find_prev_leaf_sibling(database, table_name, inform, start);
			if (pre_node > 0)
			{
				blockInfo *Block_temp1 = get_file_block(database, table_name, pre_node, 1);		//type kind = 1 get from 1
				CString node_value1(Block_temp1->cBlock);
				int num = _ttoi(node_value1.Mid(1, 4));
				end = pre_node;
				inform.offset = num - 1;
			}
			else
			{
				start = 0;
			}
		}
	}
	else
	{
		cout << "index::getoffset::wrong flag type" << endl;
		return -3;
	}
	return real_offset;
}



推荐阅读