Fuzzing Like A Caveman
- Fuzzing Like A Caveman: builds a simple fuzzer in Python with two mutation methods, bit flipping and magic-number replacement.
- Fuzzing Like A Caveman 2: Improving Performance: refines several parts of the fuzzer. It uses [cProfile](<https://docs.python.org/zh-cn/3/library/profile.html#module-cProfile>) to profile the Python program, switches to popen to improve how the target program is executed, and then rewrites the fuzzer in C++ and in C.
- Fuzzing Like A Caveman 3: Trying to Somewhat Understand The Importance Code Coverage
- Fuzzing Like A Caveman 4: Snapshot/Code Coverage Fuzzer!
- Fuzzing Like A Caveman 5: A Code Coverage Tour for Cavepeople
C++ Fuzzer Walkthrough
// headers used by the whole C++ fuzzer
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

//
// this function simply creates a stream by opening a file in binary mode;
// finds the end of file, creates a string 'data', resizes data to be the same
// size as the file; moves the file pointer back to the beginning of the file;
// reads the data from the file into the data string;
//
std::string get_bytes(std::string filename)
{
    std::ifstream fin(filename, std::ios::binary);
    if (fin.is_open())
    {
        fin.seekg(0, std::ios::end);
        std::string data;
        data.resize(fin.tellg());
        fin.seekg(0, std::ios::beg);
        fin.read(&data[0], data.size());
        return data;
    }
    else
    {
        std::cout << "Failed to open " << filename << ".\n";
        exit(1);
    }
}
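// seekg and seekp
// seekp sets the position used to put (write) data into a file; seekg sets the position used to get (read) data from it
// file.seekg(2L, ios::beg); sets the read position to the third byte (byte 2) from the start of the file
//http://c.biancheng.net/view/1541.html
// tellg and tellp
// tellp returns the current write position; tellg returns the current read position
// resize changes the number of valid elements in the container (it affects size()). Shrinking truncates the extra data; growing fills the new elements with the optional second argument, or with value-initialized elements if it is omitted.
// read: the first argument is the buffer that receives the bytes, the second is how many bytes to read into it
PS: all of this just copies the contents of the file named filename into data. It is surprisingly roundabout.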
The bit_flip function
//
// this will take 1% of the bytes from our valid jpeg and
// flip a random bit in the byte and return the altered string
//
std::string bit_flip(std::string data)
{
    // leave the last 4 bytes of the file untouched
    int size = (data.length() - 4);
    int num_of_flips = (int)(size * .01);
    // get a vector full of 1% of random byte indexes
    std::vector<int> picked_indexes;
    for (int i = 0; i < num_of_flips; i++)
    {
        int picked_index = rand() % size;
        picked_indexes.push_back(picked_index);
    }
    // iterate through the data string at those indexes and flip a bit
    for (std::size_t i = 0; i < picked_indexes.size(); ++i)
    {
        int index = picked_indexes[i];
        char current = data.at(index);
        int decimal = ((int)current & 0xff);
        int bit_to_flip = rand() % 8;
        decimal ^= 1 << bit_to_flip;
        decimal &= 0xff;
        data[index] = (char)decimal;
    }
    return data;
}
The C++ replacement for the create_new function
//
// takes mutated string and creates new jpeg with it;
//
void create_new(std::string mutated)
{
    std::ofstream fout("mutated.jpg", std::ios::binary);
    if (fout.is_open())
    {
        fout.seekp(0, std::ios::beg);
        fout.write(&mutated[0], mutated.size());
    }
    else
    {
        std::cout << "Failed to create mutated.jpg" << ".\n";
        exit(1);
    }
}
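get_output runs the string it is given as a shell command and returns the command's output.
exif calls get_output, checks the output for crash markers, and saves the mutated input when one is found.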
//
// function to run a system command and store the output as a string;
// <https://www.jeremymorgan.com/tutorials/c-programming/how-to-capture-the-output-of-a-linux-command-in-c/>
//
std::string get_output(std::string cmd)
{
    std::string output;
    FILE * stream;
    char buffer[256];
    stream = popen(cmd.c_str(), "r");
    if (stream)
    {
        // fgets returns NULL at end of output, which ends the loop
        while (fgets(buffer, sizeof(buffer), stream) != NULL)
            output.append(buffer);
        pclose(stream);
    }
    return output;
}
//
// we actually run our exif command via the get_output() func;
// retrieve the output in the form of a string and then we can parse the string;
// we'll save all the outputs that result in a segfault or floating point except;
//
void exif(std::string mutated, int counter)
{
    std::string command = "exif mutated.jpg -verbose 2>&1";
    std::string output = get_output(command);
    std::string segfault = "Segmentation";
    std::string floating_point = "Floating";
    std::size_t pos1 = output.find(segfault);
    std::size_t pos2 = output.find(floating_point);
    // find() returns std::string::npos when the marker is absent
    if (pos1 != std::string::npos)
    {
        std::cout << "Segfault!\n";
        std::ostringstream oss;
        oss << "/root/cppcrashes/crash." << counter << ".jpg";
        std::string filename = oss.str();
        std::ofstream fout(filename, std::ios::binary);
        if (fout.is_open())
        {
            fout.seekp(0, std::ios::beg);
            fout.write(&mutated[0], mutated.size());
        }
        else
        {
            std::cout << "Failed to create " << filename << ".\n";
            exit(1);
        }
    }
    else if (pos2 != std::string::npos)
    {
        std::cout << "Floating Point!\n";
        std::ostringstream oss;
        oss << "/root/cppcrashes/crash." << counter << ".jpg";
        std::string filename = oss.str();
        std::ofstream fout(filename, std::ios::binary);
        if (fout.is_open())
        {
            fout.seekp(0, std::ios::beg);
            fout.write(&mutated[0], mutated.size());
        }
        else
        {
            std::cout << "Failed to create " << filename << ".\n";
            exit(1);
        }
    }
}
Defining the second mutation method
//
// simply generates a vector of strings that are our 'magic' values;
//
std::vector<std::string> vector_gen()
{
    std::vector<std::string> magic;
    // the "..."s literal keeps embedded \x00 bytes instead of stopping at them
    using namespace std::string_literals;
    magic.push_back("\xff");
    magic.push_back("\x7f");
    magic.push_back("\x00"s);
    magic.push_back("\xff\xff");
    magic.push_back("\x7f\xff");
    magic.push_back("\x00\x00"s);
    magic.push_back("\xff\xff\xff\xff");
    magic.push_back("\x80\x00\x00\x00"s);
    magic.push_back("\x40\x00\x00\x00"s);
    magic.push_back("\x7f\xff\xff\xff");
    return magic;
}
//
// randomly picks a magic value from the vector and overwrites that many bytes in the image;
//
std::string magic(std::string data, std::vector<std::string> magic)
{
    int vector_size = magic.size();
    int picked_magic_index = rand() % vector_size;
    std::string picked_magic = magic[picked_magic_index];
    int size = (data.length() - 4);
    int picked_data_index = rand() % size;
    data.replace(picked_data_index, picked_magic.length(), picked_magic);
    return data;
}
//
// returns 0 or 1;
//
int func_pick()
{
    int result = rand() % 2;
    return result;
}
The main function
int main(int argc, char** argv)
{
    if (argc < 3)
    {
        std::cout << "Usage: ./cppfuzz <valid jpeg> <number_of_fuzzing_iterations>\n";
        std::cout << "Usage: ./cppfuzz Canon_40D.jpg 10000\n";
        return 1;
    }
    // start timer
    auto start = std::chrono::high_resolution_clock::now();
    // initialize our random seed
    srand((unsigned)time(NULL));
    // generate our vector of magic numbers
    std::vector<std::string> magic_vector = vector_gen();
    std::string filename = argv[1];
    int iterations = atoi(argv[2]);
    int counter = 0;
    while (counter < iterations)
    {
        std::string data = get_bytes(filename);
        int function = func_pick();
        // overrides the random pick, so only the bit flip branch runs
        function = 1;
        if (function == 0)
        {
            // utilize the magic mutation method; create new jpg; send to exif
            std::string mutated = magic(data, magic_vector);
            create_new(mutated);
            exif(mutated,counter);
            counter++;
        }
        else
        {
            // utilize the bit flip mutation; create new jpg; send to exif
            std::string mutated = bit_flip(data);
            create_new(mutated);
            exif(mutated,counter);
            counter++;
        }
    }
    // stop timer and print execution time
    auto stop = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
    std::cout << "Execution Time: " << duration.count() << "ms\n";
    return 0;
}
Summary: there are quite a few C++ techniques to pick up here. I work almost entirely in Python and had forgotten most of my C++.
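Optimizing the C Fuzzer
The main change is replacing popen with direct calls such as fork and execvp.
One curious property of fork is that it is called once but returns twice. It has three possible return values:
1) In the parent process, fork returns the process ID of the newly created child;
2) In the child process, fork returns 0;
3) On error, fork returns a negative value (-1) and no child is created.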
Part 4 Code Walkthrough
long long unsigned start_addr = 0x555555554b41; //
long long unsigned end_addr = 0x5555555548c0; //
vuln.bp_addresses[0] = 0x555555554b7d;
vuln.bp_addresses[1] = 0x555555554bb9;
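Two breakpoints are set, one at start and one at end. When the first breakpoint is hit, the fuzzer reads the register values, removes that breakpoint, subtracts 1 from rip, and writes the registers back.
Dynamic breakpoints: placed at the start of check2 and check3.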
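The breakpoint handling described above can be sketched with ptrace on Linux x86-64. This is an illustration under assumptions, not the code from the original post: the helper names with_int3, set_breakpoint and clear_breakpoint are hypothetical, and the tracee is assumed to be already attached and stopped. A breakpoint is the one-byte int3 opcode (0xCC) written over the first byte of an instruction. After the trap, rip points one byte past the 0xCC, which is why it has to be decremented by 1 once the original byte is restored.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns word with its lowest byte replaced by 0xCC, the int3 opcode.
 * On little-endian x86-64 the lowest byte is the byte at the peeked address. */
long with_int3(long word)
{
    return (long)(((unsigned long)word & ~0xffUL) | 0xccUL);
}

#if defined(__linux__) && defined(__x86_64__)
#include <errno.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>

/* Plants int3 at addr in the stopped tracee and saves the original word. */
int set_breakpoint(pid_t pid, uintptr_t addr, long *saved)
{
    errno = 0;
    long word = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
    if (word == -1 && errno != 0)
        return -1;                  /* -1 is also a valid word, so check errno */
    *saved = word;
    if (ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)with_int3(word)) == -1)
        return -1;
    return 0;
}

/* Call after the tracee trapped at addr: restores the original word and
 * moves rip back by one byte so the original instruction runs. */
int clear_breakpoint(pid_t pid, uintptr_t addr, long saved)
{
    struct user_regs_struct regs;
    if (ptrace(PTRACE_GETREGS, pid, NULL, &regs) == -1)
        return -1;
    if (ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)saved) == -1)
        return -1;
    regs.rip -= 1;
    if (ptrace(PTRACE_SETREGS, pid, NULL, &regs) == -1)
        return -1;
    return 0;
}
#endif
```

A snapshot fuzzer would call set_breakpoint for the start and end addresses, resume the tracee with PTRACE_CONT, and call clear_breakpoint when waitpid reports a SIGTRAP stop.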
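The C lseek() function: moving a file's read/write position
Header files:
#include <sys/types.h>
#include <unistd.h>
Prototype:
off_t lseek(int fildes, off_t offset, int whence);
Description: every open file has a read/write position. When a file is opened, the position normally points to the start of the file. If the file is opened in append mode (O_APPEND), the position is moved to the end of the file before each write. read() and write() advance the position, and lseek() sets it explicitly. The fildes argument is an open file descriptor, and offset is the displacement applied according to whence.
The whence argument is one of:
- SEEK_SET: offset becomes the new read/write position.
- SEEK_CUR: the position is moved forward by offset from the current position.
- SEEK_END: the position is set to the end of the file plus offset.
When whence is SEEK_CUR or SEEK_END, offset may be negative.
Some common idioms:
1) Move to the start of the file: lseek(fildes, 0, SEEK_SET);
2) Move to the end of the file: lseek(fildes, 0, SEEK_END);
3) Get the current position: lseek(fildes, 0, SEEK_CUR);
Return value: on success, the resulting read/write position, that is, the number of bytes from the start of the file. On error, -1 is returned and errno holds the error code.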
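The three idioms above combine into a common trick: finding a file's size without disturbing the caller's position. This is a minimal sketch, and the helper name file_size is hypothetical.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * file_size (hypothetical helper): returns the size in bytes of the file
 * behind fd, or -1 on error. The read/write position is restored before
 * returning.
 */
off_t file_size(int fd)
{
    /* idiom 3: remember where we are */
    off_t current = lseek(fd, 0, SEEK_CUR);
    if (current == (off_t)-1)
        return -1;

    /* idiom 2: the offset of the end of the file is its size */
    off_t end = lseek(fd, 0, SEEK_END);
    if (end == (off_t)-1)
        return -1;

    /* SEEK_SET with the saved offset puts the position back */
    if (lseek(fd, current, SEEK_SET) == (off_t)-1)
        return -1;
    return end;
}
```

This mirrors what the C++ get_bytes function does with seekg and tellg. It fails with -1 on descriptors that cannot seek, such as pipes.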
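The C read() function: reading from an open file
Related functions: readdir, write, fcntl, close, lseek, readlink, fread
Header file: #include <unistd.h>
Prototype: ssize_t read(int fd, void * buf, size_t count);
Description: read() transfers up to count bytes from the file referred to by fd into the memory pointed to by buf. If count is 0, read() has no effect and returns 0. The return value is the number of bytes actually read; a return value of 0 means the end of the file has been reached or there is no data to read. The file's read/write position advances by the number of bytes read.
Notes:
On success read() returns the number of bytes actually read, and it is best to compare that value with count. If fewer bytes were returned than requested, the read may have reached the end of the file, may be reading from a pipe or a terminal, or may have been interrupted by a signal.
On error, -1 is returned, the error code is stored in errno, and the file's read/write position is unspecified.
Error codes:
- EINTR: the call was interrupted by a signal.
- EAGAIN: non-blocking I/O (O_NONBLOCK) is in use and no data is available to read.
- EBADF: fd is not a valid file descriptor, or the file has already been closed.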
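Because read() may return fewer bytes than requested and may fail with EINTR, callers usually wrap it in a loop. The following is a minimal sketch under that assumption; the helper name read_full is hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * read_full (hypothetical helper): keeps calling read() until count bytes
 * have arrived or end of file is reached. Returns the number of bytes read
 * (less than count only at end of file), or -1 on error with errno set.
 */
ssize_t read_full(int fd, void *buf, size_t count)
{
    size_t total = 0;
    while (total < count) {
        ssize_t n = read(fd, (char *)buf + total, count - total);
        if (n > 0)
            total += (size_t)n;     /* short read: keep going */
        else if (n == 0)
            break;                  /* end of file */
        else if (errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        else
            return -1;              /* real error, e.g. EBADF */
    }
    return (ssize_t)total;
}
```

On a descriptor opened with O_NONBLOCK this sketch returns -1 with errno set to EAGAIN when no data is ready, so a non-blocking caller would need its own retry policy.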
Fuzzer Development
Prerequisites
How does a C function return a string?
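One common answer is to allocate the string on the heap and hand ownership to the caller, since a pointer to a local array becomes invalid when the function returns. The sketch below assumes that approach; the function name make_crash_name and the file name pattern are hypothetical, loosely modeled on the crash file names used by the C++ fuzzer.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * make_crash_name (hypothetical helper): builds "<dir>/crash.<counter>.jpg"
 * in freshly allocated memory. Returns NULL on failure.
 * The caller owns the result and must free() it.
 */
char *make_crash_name(const char *dir, int counter)
{
    /* First pass with a NULL buffer only measures the required length. */
    int needed = snprintf(NULL, 0, "%s/crash.%d.jpg", dir, counter);
    if (needed < 0)
        return NULL;

    char *name = malloc((size_t)needed + 1);   /* +1 for the terminating '\0' */
    if (name == NULL)
        return NULL;

    snprintf(name, (size_t)needed + 1, "%s/crash.%d.jpg", dir, counter);
    return name;
}
```

The other usual options are having the caller pass in a buffer and its size, or returning a pointer to static storage, which is simple but not safe to call again before the previous result has been used.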
Casting to int
(type_name) expression
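Two casts of this form appear in the fuzzer's mutation code, and a small sketch makes their effect concrete. The function names one_percent and byte_value are hypothetical; the expressions follow the ones used in bit_flip.

```c
#include <stddef.h>

/* 1% of size as an int: converting a double to int drops the fractional part. */
int one_percent(size_t size)
{
    return (int)(size * .01);
}

/* Value of a byte as 0..255, regardless of whether plain char is signed. */
int byte_value(char c)
{
    return (int)(unsigned char)c;
}
```

Casting through unsigned char has the same effect as the (int)current & 0xff expression in bit_flip: it prevents a byte such as 0xff from turning into a negative int on platforms where char is signed.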
Creating a dynamically sized array
Implementing a dynamic array in C to overcome the fixed size of static arrays (C语言中文网)
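A minimal sketch of a growable array in C, assuming the usual realloc-and-double strategy. This is not taken from the linked article; the IntArray type and its functions are hypothetical names. It plays the role that std::vector<int> plays for picked_indexes in the C++ fuzzer.

```c
#include <stdlib.h>

/* A growable array of int. Initialize with {0} before first use. */
typedef struct {
    int *items;   /* heap storage, NULL while empty */
    size_t len;   /* number of elements in use */
    size_t cap;   /* number of elements allocated */
} IntArray;

/* Appends value, doubling the capacity when full. Returns 0, or -1 on
 * allocation failure (the array is left unchanged in that case). */
int int_array_push(IntArray *a, int value)
{
    if (a->len == a->cap) {
        size_t new_cap = a->cap ? a->cap * 2 : 8;
        int *grown = realloc(a->items, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        a->items = grown;
        a->cap = new_cap;
    }
    a->items[a->len++] = value;
    return 0;
}

/* Releases the storage and resets the array to the empty state. */
void int_array_free(IntArray *a)
{
    free(a->items);
    a->items = NULL;
    a->len = 0;
    a->cap = 0;
}
```

Doubling the capacity keeps the average cost of a push constant, at the price of up to half the allocated memory sitting unused.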