深入理解计算机系统(英文版.第2版)(双色印刷,计算机软硬件理论结合讲述的经典之作)(china-pub首发)
基本信息
- 作者: (美)Randal E.Bryant David R. O'Hallaron [作译者介绍]
- 丛书名: 经典原版书库
- 出版社:机械工业出版社
- ISBN:9787111326311
- 上架时间:2011-1-17
- 出版日期:2011 年1月
- 开本:16开
- 页码:1077
- 版次:2-1
- 所属分类:
计算机 > 计算机组织与体系结构 > 综合
编辑推荐
双色印刷,计算机软硬件理论结合讲述的经典之作,被誉为“价值超过等重量黄金的无价资源宝库”
Amazon五星图书,卡耐基梅隆大学计算机学院院长、IEEE、ACM和美国工程院院士倾力奉献,中英文版同步上市!
推荐阅读
内容简介回到顶部↑
计算机书籍
本书是一本将计算机软件和硬件理论结合讲述的经典教程,内容覆盖计算机导论、体系结构和处理器设计等多门课程。本书的最大优点是为程序员描述计算机系统的实现细节,通过描述程序是如何映射到系统上,以及程序是如何执行的,使读者更好地理解程序的行为为什么是这样的,以及造成效率低下的原因。
相对于第1版,本版主要是反映了过去十年间硬件技术和编译器的变化,具体更新如下:
1. 对系统的介绍(特别是实际使用部分)做了增加和修改。例如,既保持了原有的针对32位系统的说明,又增加了对64位系统的描述。
2. 增加了很多关于由算术运算溢出以及缓冲区溢出造成安全漏洞的内容。
3. 更详细讲述了处理器对异常的发现和处理。
4. 描述了基于intel core i7处理器的存储器层次结构,还增加了固态硬盘的内容。
5. 强调并发性,增加了关于并发性一般原则的内容。
作译者回到顶部↑
本书提供作译者介绍
作者: David R. O’Hallaron
David R. O’Hallaron 现为Intel匹兹堡实验室主任,卡内基梅隆大学电子和计算机工程系副教授.在弗吉尼亚大学(UniversitycofcVirginia)获得计算机科学的博士学位.
466他教授本科生和研究生的计算机系统方面的课程,例如计算机体系结构、计算机系统导论、并行处理器设计和Internet服务.他和Bryant教授一起开设了“计算机系统导论”课程,那便是此书的基础. 2004年他获得了CMU计算机学院颁发的Herbert Simon杰出教学奖,这个奖项的获得者是基于学生的投票产生的.
O’Hallaron教授从事计算机系统.. <<
查看详细
[同作者作品]
深入理解计算机系统(修订版)(09年度畅销榜TOP50)(08年度畅销榜TOP50)
深入理解计算机系统(原书第2版)(Amazon五星图书,被誉为“价值超过等重量黄金的无价资源宝库”)
深入理解计算机系统(英文版.第2版)(双色印刷,计算机软硬件理论结合讲述的经典之作)(china-pub首发)
作者: Randal E. Bryant
Randal E. Bryant 1973年于密歇根大学(University of Michigan)获得学士学位,随即就读于麻省理工学院(Massachusetts Institute of Technology)的研究生院,并在1981年获计算机博士学位。他在加州理工学院(California Institute of Technology)做了三年助教,从1984年至今一直是卡内基梅隆大学(Carnegie Mellon)的教师。他现在是计算机科学的大学教授(university professor)和计算机科学学院的院长。他同时还受邀于电子和计算机工程系。.. <<
查看详细
[同作者作品]
深入理解计算机系统(修订版)(09年度畅销榜TOP50)(08年度畅销榜TOP50)
深入理解计算机系统(原书第2版)(Amazon五星图书,被誉为“价值超过等重量黄金的无价资源宝库”)
深入理解计算机系统(英文版.第2版)(双色印刷,计算机软硬件理论结合讲述的经典之作)(china-pub首发)
目录回到顶部↑
preface 7
about the authors 21
1 a tour of computer systems 35
1.1 information is bits + context 37
1.2 programs are translated by other programs into different forms 38
1.3 it pays to understand how compilation systems work 40
1.4 processors read and interpret instructions stored in memory 41
1.4.1 hardware organization of a system 41
1.4.2 running the hello program 44
1.5 caches matter 46
1.6 storage devices form a hierarchy 47
1.7 the operating system manages the hardware 48
1.7.1 processes 50
1.7.2 threads 51
1.7.3 virtual memory 51
1.7.4 files 53
1.8 systems communicate with other systems using networks 54
1.9 important themes 55
1.9.1 concurrency and parallelism 55
1.10 summary 59
bibliographic notes 60
part i program structure and execution
2 representing and manipulating information 63
2.1 information storage 67
2.1.1 hexadecimal notation 68
2.1.2 words 72
2.1.3 data sizes 72
computer systems contents
a programmer’s perspective, 2e
24 contents
2.1.4 addressing and byte ordering 73
2.1.5 representing strings 80
2.1.6 representing code 81
2.1.7 introduction to boolean algebra 82
2.1.8 bit-level operations in c 85
2.1.9 logical operations in c 88
2.1.10 shift operations in c 88
2.2 integer representations 90
2.2.1 integral data types 91
2.2.2 unsigned encodings 92
2.2.3 two’s-complement encodings 94
2.2.4 conversions between signed and unsigned 99
2.2.5 signed vs. unsigned in c 103
2.2.6 expanding the bit representation of a number 105
2.2.7 truncating numbers 109
2.2.8 advice on signed vs. unsigned 110
2.3 integer arithmetic 113
2.3.1 unsigned addition 113
2.3.2 two’s-complement addition 117
2.3.3 two’s-complement negation 121
2.3.4 unsigned multiplication 122
2.3.5 two’s-complement multiplication 123
2.3.6 multiplying by constants 126
2.3.7 dividing by powers of two 129
2.3.8 final thoughts on integer arithmetic 132
2.4 floating point 133
2.4.1 fractional binary numbers 134
2.4.2 ieee floating-point representation 137
2.4.3 example numbers 139
2.4.4 rounding 144
2.4.5 floating-point operations 147
2.4.6 floating point in c 148
2.5 summary 152
bibliographic notes 153
homework problems 153
solutions to practice problems 168
3 machine-level representation of programs 187
3.1 a historical perspective 190
3.2 program encodings 193
contents 25
3.2.1 machine-level code 194
3.2.2 code examples 196
3.2.3 notes on formatting 199
3.3 data formats 201
3.4 accessing information 202
3.4.1 operand specifiers 203
3.4.2 data movement instructions 205
3.4.3 data movement example 208
3.5 arithmetic and logical operations 211
3.5.1 load effective address 211
3.5.2 unary and binary operations 212
3.5.3 shift operations 213
3.5.4 discussion 214
3.5.5 special arithmetic operations 216
3.6 control 219
3.6.1 condition codes 219
3.6.2 accessing the condition codes 221
3.6.3 jump instructions and their encodings 223
3.6.4 translating conditional branches 227
3.6.5 loops 231
3.6.6 conditional move instructions 240
3.6.7 switch statements 247
3.7 procedures 253
3.7.1 stack frame structure 253
3.7.2 transferring control 255
3.7.3 register usage conventions 257
3.7.4 procedure example 258
3.7.5 recursive procedures 263
3.8 array allocation and access 266
3.8.1 basic principles 266
3.8.2 pointer arithmetic 267
3.8.3 nested arrays 269
3.8.4 fixed-size arrays 271
3.8.5 variable-size arrays 272
3.9 heterogeneous data structures 275
3.9.1 structures 275
3.9.2 unions 278
3.9.3 data alignment 282
3.10 putting it together: understanding pointers 286
3.11 life in the realworld: using the gdb debugger 288
3.12 out-of-bounds memory references and buffer overflow 290
3.12.1 thwarting buffer overflow attacks 295
26 contents
3.13 x86-64: extending ia32 to 64 bits 301
3.13.1 history and motivation for x86-64 302
3.13.2 an overview of x86-64 304
3.13.3 accessing information 307
3.13.4 control 313
3.13.5 data structures 324
3.13.6 concluding observations about x86-64 325
3.14 machine-level representations of floating-point programs 326
3.15 summary 327
bibliographic notes 328
homework problems 328
solutions to practice problems 342
4 processor architecture 367
4.1 the y86 instruction set architecture 370
4.1.1 programmer-visible state 370
4.1.2 y86 instructions 371
4.1.3 instruction encoding 373
4.1.4 y86 exceptions 378
4.1.5 y86 programs 379
4.1.6 some y86 instruction details 384
4.2 logic design and the hardware control language hcl 386
4.2.1 logic gates 387
4.2.2 combinational circuits and hcl boolean expressions 388
4.2.3 word-level combinational circuits and hcl integer
expressions 389
4.2.4 set membership 394
4.2.5 memory and clocking 395
4.3 sequential y86 implementations 398
4.3.1 organizing processing into stages 398
4.3.2 seq hardware structure 409
4.3.3 seq timing 413
4.3.4 seq stage implementations 417
4.4 general principles of pipelining 425
4.4.1 computational pipelines 426
4.4.2 a detailed look at pipeline operation 427
4.4.3 limitations of pipelining 428
4.4.4 pipelining a system with feedback 432
4.5 pipelined y86 implementations 434
4.5.1 seq+: rearranging the computation stages 434
contents 27
4.5.2 inserting pipeline registers 435
4.5.3 rearranging and relabeling signals 439
4.5.4 next pc prediction 440
4.5.5 pipeline hazards 442
4.5.6 avoiding data hazards by stalling 447
4.5.7 avoiding data hazards by forwarding 449
4.5.8 load/use data hazards 452
4.5.9 exception handling 454
4.5.10 pipe stage implementations 457
4.5.11 pipeline control logic 465
4.5.12 performance analysis 478
4.5.13 unfinished business 480
4.6 summary 483
4.6.1 y86 simulators 484
bibliographic notes 485
homework problems 485
solutions to practice problems 491
5 optimizing program performance 507
5.1 capabilities and limitations of optimizing compilers 510
5.2 expressing program performance 514
5.3 program example 516
5.4 eliminating loop inefficiencies 520
5.5 reducing procedure calls 524
5.6 eliminating unneeded memory references 525
5.7 understanding modern processors 530
5.7.1 overall operation 531
5.7.2 functional unit performance 534
5.7.3 an abstract model of processor operation 536
5.8 loop unrolling 543
5.9 enhancing parallelism 547
5.9.1 multiple accumulators 548
5.9.2 reassociation transformation 552
5.10 summary of results for optimizing combining code 558
5.11 some limiting factors 559
5.11.1 register spilling 559
5.11.2 branch prediction and misprediction penalties 560
5.12 understanding memory performance 565
5.12.1 load performance 565
5.12.2 store performance 566
28 contents
5.13 life in the real world: performance improvement techniques 573
5.14 identifying and eliminating performance bottlenecks 574
5.14.1 program profiling 574
5.14.2 using a profiler to guide optimization 576
5.14.3 amdahl’s law 579
5.15 summary 581
bibliographic notes 582
homework problems 583
solutions to practice problems 586
6 the memory hierarchy 593
6.1 storage technologies 595
6.1.1 random-access memory 595
6.1.2 disk storage 604
6.1.3 solid state disks 615
6.1.4 storage technology trends 617
6.2 locality 620
6.2.1 locality of references to program data 621
6.2.2 locality of instruction fetches 622
6.2.3 summary of locality 623
6.3 the memory hierarchy 625
6.3.1 caching in the memory hierarchy 626
6.3.2 summary of memory hierarchy concepts 629
6.4 cache memories 630
6.4.1 generic cache memory organization 631
6.4.2 direct-mapped caches 633
6.4.3 set associative caches 640
6.4.4 fully associative caches 642
6.4.5 issues with writes 645
6.4.6 anatomy of a real cache hierarchy 646
6.4.7 performance impact of cache parameters 648
6.5 writing cache-friendly code 649
6.6 putting it together: the impact of caches on program performance 654
6.6.1 the memory mountain 655
6.6.2 rearranging loops to increase spatial locality 659
6.6.3 exploiting locality in your programs 663
6.7 summary 663
bibliographic notes 664
homework problems 665
solutions to practice problems 676
contents 29
part ii running programs on a system
7 linking 687
7.1 compiler drivers 689
7.2 static linking 691
7.3 object files 691
7.4 relocatable object files 692
7.5 symbols and symbol tables 694
7.6 symbol resolution 697
7.6.1 how linkers resolve multiply defined global symbols 698
7.6.2 linking with static libraries 701
7.6.3 how linkers use static libraries to resolve references 704
7.7 relocation 706
7.7.1 relocation entries 706
7.7.2 relocating symbol references 707
7.8 executable object files 712
7.9 loading executable object files 713
7.10 dynamic linking with shared libraries 715
7.11 loading and linking shared libraries from applications 717
7.12 position-independent code (pic) 721
7.13 tools for manipulating object files 724
7.14 summary 725
bibliographic notes 725
homework problems 726
solutions to practice problems 732
8 exceptional control flow 735
8.1 exceptions 737
8.1.1 exception handling 738
8.1.2 classes of exceptions 740
8.1.3 exceptions in linux/ia32 systems 742
8.2 processes 746
8.2.1 logical control flow 746
8.2.2 concurrent flows 747
8.2.3 private address space 748
8.2.4 user and kernel modes 748
8.2.5 context switches 750
30 contents
8.3 system call error handling 751
8.4 process control 752
8.4.1 obtaining process ids 753
8.4.2 creating and terminating processes 753
8.4.3 reaping child processes 757
8.4.4 putting processes to sleep 763
8.4.5 loading and running programs 764
8.4.6 using fork and execve to run programs 767
8.5 signals 770
8.5.1 signal terminology 772
8.5.2 sending signals 773
8.5.3 receiving signals 776
8.5.4 signal handling issues 779
8.5.5 portable signal handling 786
8.5.6 explicitly blocking and unblocking signals 787
8.5.7 synchronizing flows to avoid nasty concurrency bugs 789
8.6 nonlocal jumps 793
8.7 tools for manipulating processes 796
8.8 summary 797
bibliographic notes 797
homework problems 798
solutions to practice problems 805
9 virtual memory 809
9.1 physical and virtual addressing 811
9.2 address spaces 812
9.3 vm as a tool for caching 813
9.3.1 dram cache organization 814
9.3.2 page tables 814
9.3.3 page hits 816
9.3.4 page faults 816
9.3.5 allocating pages 817
9.3.6 locality to the rescue again 818
9.4 vm as a tool for memory management 819
9.5 vm as a tool for memory protection 820
9.6 address translation 821
9.6.1 integrating caches and vm 825
9.6.2 speeding up address translation with a tlb 825
9.6.3 multi-level page tables 826
9.6.4 putting it together: end-to-end address translation 828
9.7 case study: the intel core i7/linux memory system 833
contents 31
9.7.1 core i7 address translation 834
9.7.2 linux virtual memory system 837
9.8 memory mapping 841
9.8.1 shared objects revisited 841
9.8.2 the fork function revisited 843
9.8.3 the execve function revisited 844
9.8.4 user-level memory mapping with the mmap function 844
9.9 dynamic memory allocation 846
9.9.1 the malloc and free functions 848
9.9.2 why dynamic memory allocation? 850
9.9.3 allocator requirements and goals 851
9.9.4 fragmentation 853
9.9.5 implementation issues 854
9.9.6 implicit free lists 854
9.9.7 placing allocated blocks 856
9.9.8 splitting free blocks 857
9.9.9 getting additional heap memory 857
9.9.10 coalescing free blocks 858
9.9.11 coalescing with boundary tags 858
9.9.12 putting it together: implementing a simple allocator 861
9.9.13 explicit free lists 869
9.9.14 segregated free lists 870
9.10 garbage collection 872
9.10.1 garbage collector basics 873
9.10.2 mark&sweep garbage collectors 874
9.10.3 conservative mark&sweep for c programs 876
9.11 common memory-related bugs in c programs 877
9.11.1 dereferencing bad pointers 877
9.11.2 reading uninitialized memory 877
9.11.3 allowing stack buffer overflows 878
9.11.4 assuming that pointers and the objects they point to are the same size 878
9.11.5 making off-by-one errors 879
9.11.6 referencing a pointer instead of the object it points to 879
9.11.7 misunderstanding pointer arithmetic 880
9.11.8 referencing nonexistent variables 880
9.11.9 referencing data in free heap blocks 881
9.11.10 introducing memory leaks 881
9.12 summary 882
bibliographic notes 882
homework problems 883
solutions to practice problems 887
32 contents
part iii interaction and communication between
programs
10 system-level i/o 895
10.1 unix i/o 896
10.2 opening and closing files 897
10.3 reading and writing files 899
10.4 robust reading and writing with the rio package 901
10.4.1 rio unbuffered input and output functions 901
10.4.2 rio buffered input functions 902
10.5 reading file metadata 907
10.6 sharing files 909
10.7 i/o redirection 911
10.8 standard i/o 913
10.9 putting it together: which i/o functions should i use? 914
10.10 summary 915
bibliographic notes 916
homework problems 916
solutions to practice problems 917
11 network programming 919
11.1 the client-server programming model 920
11.2 networks 921
11.3 the global ip internet 925
11.3.1 ip addresses 927
11.3.2 internet domain names 929
11.3.3 internet connections 933
11.4 the sockets interface 934
11.4.1 socket address structures 935
11.4.2 the socket function 936
11.4.3 the connect function 937
11.4.4 the open_clientfd function 937
11.4.5 the bind function 938
11.4.6 the listen function 939
11.4.7 the open_listenfd function 939
11.4.8 the accept function 941
11.4.9 example echo client and server 942
contents 33
11.5 web servers 945
11.5.1 web basics 945
11.5.2 web content 946
11.5.3 http transactions 948
11.5.4 serving dynamic content 950
11.6 putting it together: the tiny web server 953
11.7 summary 961
bibliographic notes 962
homework problems 962
solutions to practice problems 963
12 concurrent programming 967
12.1 concurrent programming with processes 969
12.1.1 a concurrent server based on processes 970
12.1.2 pros and cons of processes 971
12.2 concurrent programming with i/o multiplexing 973
12.2.1 a concurrent event-driven server based on i/omultiplexing 976
12.2.2 pros and cons of i/o multiplexing 980
12.3 concurrent programming with threads 981
12.3.1 thread execution model 982
12.3.2 posix threads 982
12.3.3 creating threads 984
12.3.4 terminating threads 984
12.3.5 reaping terminated threads 985
12.3.6 detaching threads 985
12.3.7 initializing threads 986
12.3.8 a concurrent server based on threads 986
12.4 shared variables in threaded programs 988
12.4.1 threads memory model 989
12.4.2 mapping variables to memory 990
12.4.3 shared variables 990
12.5 synchronizing threads with semaphores 991
12.5.1 progress graphs 994
12.5.2 semaphores 997
12.5.3 using semaphores for mutual exclusion 998
12.5.4 using semaphores to schedule shared resources 1000
12.5.5 putting it together: a concurrent server based on prethreading 1004
12.6 using threads for parallelism 1008
34 contents
12.7 other concurrency issues 1013
12.7.1 thread safety 1013
12.7.2 reentrancy 1014
12.7.3 using existing library functions in threaded programs 1016
12.7.4 races 1017
12.7.5 deadlocks 1019
12.8 summary 1022
bibliographic notes 1023
homework problems 1023
solutions to practice problems 1028
a error handling 1033
a.1 error handling in unix systems 1034
a.2 error-handling wrappers 1035
references 1039
index 1045
前言回到顶部↑
我们的目的是解释所有计算机系统的本质概念,并向你展示这些概念是如何实实在在地影响应用程序的正确性、性能和实用性的。其他的系统类书籍都是从构建者的角度来写的,讲述如何实现硬件或是系统软件,包括操作系统、编译器和网络接口。而本书是从程序员的角度来写的,讲述应用程序员如何能够利用系统知识来编写出更好的程序。当然,学习一个计算机系统应该做些什么,是学习如何构建一个计算机系统的很好的出发点,所以,对于希望继续学习系统软硬件实现的人来说,本书也是一本很有价值的介绍性读物。
本书概述
本书由12 章组成,旨在阐述计算机系统的核心概念。
·第1 章:计算机系统漫游。这一章通过研究“hello, world”这个简单程序的生命周期,介绍计算机系统的主要概念和主题。
·第2 章:信息的表示和处理。我们讲述了计算机的算术运算,重点描述了会对程序员有影响的无符号数和数的二进制补码(two’s complement)表示的特性。我们考虑数字是如何表示的,以及由此确定对于一个给定的字长,其可能编码值的范围。我们讨论该如何表示数字,以及因此用给定的字长能编码的数值的范围。我们探讨有符号和无符号数字之间类型转换的效果,还阐述算术运算的数学特性。菜鸟级程序员经常很惊奇地了解到(用二进制补码表示的)两个正数的和或者积可能为负。另一方面,二进制补码的算术运算满足代数环的特性,因此,编译器可以很安全地把一个常量乘法转化为一系列的移位和加法。我们用C 语言的位级操作来说明布尔代数的原理和应用。我们从两个方面讲述了IEEE 标准的浮点格式:一是如何用它来表示数值,一是浮点运算的数学属性。
·第3 章:程序的机器级表示。我们教读者如何阅读由C 编译器生成的IA32 和x86-64 汇编语言。我们说明为不同控制结构,比如条件、循环和开关语句,生成的基本指令模式。我们还讲述过程的执行,包括栈分配、寄存器使用惯例和参数传递。我们讨论不同数据结构(如结构、联合(union)和数组)的分配和访问方式。我们还以分析程序在机器级的样子作为途径,来理解常见的代码安全漏洞,例如,缓冲区溢出,以及理解程序员、编译器和操作系统可以采取的减轻这些威胁的措施。
·第4 章:处理器体系结构。这一章讲述基本的组合和时序逻辑元素,并展示这些元素如何在数据通路(datapath)中组合到一起来执行IA32 指令集的一个称为“Y86”的简化子集。本章中处理器设计的控制逻辑是用一种称为HCL 的简单硬件描述语言来描述的。用HCL 写的硬件设计能够编译和链接到本书提供的模拟器中,还可以根据这些设计生成Verilog 描述,它适合合成(synthesis)到实际可以运行的硬件上去。
·第5 章:优化程序性能。在这一章里,我们介绍了许多提高代码性能的技术,主要思想就是让程序员通过使编译器能够生成更有效的机器代码来学习编写C 代码。
·第6 章:存储器层次结构。对应用程序员来说,存储器系统是计算机系统中最直接可见的部分之一。我们讲述不同类型的随机存取存储器(RAM)和只读存储器(ROM),以及磁盘和固态硬盘的几何形状和组织构造。我们描述这些存储设备是如何放置在层次结构中的,讲述访问局部性是如何使这种层次结构成为可能的。我们通过一个独特的观点使这些理论具体化、形象化,那就是将存储器系统视为一个“存储器山”,山脊是时间局部性,而斜坡是空间局部性。最后,我们向读者阐述如何通过改善程序的时间局部性和空间局部性来提高应用程序的性能。
·第7 章:链接。本章讲述静态和动态链接,包括的概念有可重定位的(relocatable)和可执行的目标文件、符号解析、重定位(relocation)、静态库、共享目标库,以及与位置无关的代码。
·第8 章:异常控制流。在本书的这个部分,我们通过介绍异常控制流(比如,除了正常分支和过程调用以外的控制流的变化)的一般概念,打破单一程序的模型。我们给出存在于系统所有层次的异常控制流的例子,从底层的硬件异常和中断,到并发进程的上下文切换,到由于Unix 信号传送引起的控制流突变,到C 语言中破坏栈原则的非本地跳转(nonlocal jump)。
·第9 章:虚拟存储器。我们讲述虚拟存储器系统是希望读者对它是如何工作的以及它的特性有所了解。我们想让读者了解为什么不同的并发进程各自都有一个完全相同的地址范围,能共享某些页,而又独占另外一些页。我们还覆盖讲了一些管理和操纵虚拟存储器的问题。
特别地,我们讨论了存储分配操作,就像Unix 的malloc 和free 操作。
·第10 章:系统级I/O。我们讲述Unix I/O 的基本概念,例如文件和描述符。我们描述如何共享文件,I/O 重定向是如何工作的,还有如何访问文件的元数据。我们还开发了一个健壮的带缓冲区的I/O 包,可以正确处理一种称为short counts 的奇特行为,也就是库函数只读取一部分的输入数据。我们阐述C 的标准I/O 库,以及它与Unix I/O 的关系,重点谈到标准I/O 的局限性,这些局限性使之不适合网络编程。
·第11 章:网络编程。对编程而言,网络是非常有趣的I/O 设备,将许多我们前面文中学习的概念,比如进程、信号、字节顺序(byte order)、存储器映射和动态存储器分配,联系在一起。网络程序还为下一章的主题—并发,提供了一个很令人信服的上下文。本章只是网络编程的一个很小的部分,使读者能够编写一个Web 服务器。我们还讲述了位于所有网络程序底层的客户端- 服务器模型。我们展现了一个程序员对Internet 的观点,并且教读者如何用套接字(socket)接口来编写Internet 客户端和服务器。最后,我们介绍超文本传输协议HTTP,并开发了一个简单的迭代式(iterative)Web 服务器。
·第12 章:并发编程。这一章以Internet 服务器设计为例介绍了并发编程。我们比较对照了三种编写并发程序的基本机制(进程、I/O 多路复用技术和线程),并且展示如何用它们来建造并发Internet 服务器。我们探讨了用P、V 信号操作来实现同步、线程安全和可重入5(reentrancy)、竞争条件以及死锁等的基本原则。我们还讲述了线程级编程的使用方法,来解释应用程序中的并行性,使得程序在多核的处理器上能执行得更快。
本版新增内容
本书的第1 版于2003 年出版。考虑到计算机技术发展如此迅速,这本书的内容还算是保持得很好。事实证明Intel x86 的机器上运行类Unix 操作系统,加上采用C 语言编程,是一种能够涵盖当今许多系统的组合。硬件技术和编译器的变化,以及很多教师教授这些内容的经验,都促使我们做了大量的修改。
下面列出的是一些更加详细的改进:
·第3 章:程序的机器级表示。我们将内容的覆盖范围扩展到了包括x86-64,也就是将x86处理器扩展到了64 位字长。也使用了更新版本的GCC 产生的代码。另外还增强了对缓冲区溢出漏洞的描述。在网络旁注里,我们给出了两类不同的浮点指令,还介绍了当编译器试图做更高等级优化的时候,做的一些奇特的变换。另外,还有一个网络旁注描述了如何在一个C 语言程序中嵌入x86 汇编代码。
·第4 章:处理器体系结构。更加详细地说明了我们的处理器设计中的异常发现和处理。在网络旁注里,我们也给出了处理器设计的Verilog 描述映射,使得我们的设计能够合成到可运行的硬件上。
·第5 章:优化程序性能。我们极大地改变了对乱序处理器如何运行的描述,还提出了一种简单的技术,能够基于程序的数据流图表示中的路径来分析程序的性能。在网络旁注里,描述了C 语言程序员如何能够利用较新的x86 处理器中提供的SIMD(单指令流,多数据流)指令来编程。
·第6 章:存储器层次结构。我们增加了固态硬盘的内容,还更新了我们的表述,使之基于Intel Core i7 处理器的存储器层次结构。
·第7 章:链接。本章的变化不大。
·第8 章:异常控制流。我们改进了对于进程模型如何引入一些基本的并发概念的讨论,例如非确定性。
·第9 章:虚拟存储器。我们更新了存储器系统案例研究,采用了64 位Intel Core i7 处理器为例来讲述。我们还更新了malloc 函数的示例实现, 使之既能在32 位也能在64 位环境中执行。
·第10 章:系统级I/O。本章的变化不大。
·第11 章:网络编程。本章的变化不大。
·第12 章:并发编程。我们增加了关于并发性一般原则的内容,还讲述了程序员如何利用线程级并行性使得程序在多核机器上能运行得更快。
此外,我们还增加和修改了很多练习题和家庭作业。
6 This book (CS:APP) is for computer scientists, computer engineers, and others who want to be able to write better programs by learning what is going on “under the hood” of a computer system.
Our aim is to explain the enduring concepts underlying all computer systems, and to show you the concrete ways that these ideas affect the correctness, performance, and utility of your application programs. Other systems books are written from a builder’s perspective, describing how to implement the hardware or the systems software, including the operating system, compiler, and network interface.
This book is written from a programmer’s perspective, describing how application programmers can use their knowledge of a system to write better programs. Of course, learning what a system is supposed to do provides a good first step in learning how to build one, and so this book also serves as a valuable introduction to those who go on to implement systems hardware and software.
If you study and learn the concepts in this book, you will be on your way to becoming the rare “power programmer” who knows how things work and how to fix them when they break. Our aim is to present the fundamental concepts in ways that you will find useful right away.You will also be prepared to delve deeper, studying such topics as compilers, computer architecture, operating systems, embedded systems, and networking.
Assumptions about the Reader’s Background The presentation of machine code in the book is based on two related formats supported by Intel and its competitors, colloquially known as “x86.” IA32 is the machine code that has become the de facto standard for a wide range of systems.
x86-64 is an extension of IA32 to enable programs to operate on larger data and to reference a wider range of memory addresses. Since x86-64 systems are able to run IA32 code, both of these forms of machine code will see widespread use for the foreseeable future.We consider how these machines execute C programs on Unix or Unix-like (such as Linux) operating systems. (To simplify our presentation, we will use the term “Unix” as an umbrella term for systems having Unix as their heritage, including Solaris, Mac OS, and Linux.) The text contains numerous programming examples that have been compiled and run on Linux systems. We assume that you have access to such a machine and are able to log in and do simple things such as changing directories.
If your computer runs Microsoft Windows, you have two choices. First, you can get a copy of Linux (www.ubuntu.com) and install it as a “dual boot” option, so that your machine can run either operating system. Alternatively, by installing a copy of the Cygwin tools (www.cygwin.com), you can run a Unix-like shell under Computer Systems Preface
A Programmer’s Perspective, 2E
8 Preface
Windows and have an environment very close to that provided by Linux. Not all features of Linux are available under Cygwin, however.
We also assume that you have some familiarity with C or C++. If your only prior experience is with Java, the transition will require more effort on your part, but we will help you. Java and C share similar syntax and control statements.
However, there are aspects of C, particularly pointers, explicit dynamic memory allocation, and formatted I/O, that do not exist in Java. Fortunately, C is a small language, and it is clearly and beautifully described in the classic “K&R” text by Brian Kernighan and Dennis Ritchie [58]. Regardless of your programming background, consider K&R an essential part of your personal systems library.
Several of the early chapters in the book explore the interactions between C programs and their machine-language counterparts. The machine-language examples were all generated by the GNU gcc compiler running on IA32 and x86-64 processors. We do not assume any prior experience with hardware, machine language, or assembly-language programming.
New to C? Advice on the C programming language To help readers whose background in C programming is weak (or nonexistent), we have also included these special notes to highlight features that are especially important in C.We assume you are familiar with C++ or Java.
How to Read the Book
Learning how computer systems work from a programmer’s perspective is great fun, mainly because you can do it actively. Whenever you learn something new, you can try it out right away and see the result first hand. In fact, we believe that the only way to learn systems is to do systems, either working concrete problems or writing and running programs on real systems.
This theme pervades the entire book. When a new concept is introduced, it is followed in the text by one or more practice problems that you should work immediately to test your understanding. Solutions to the practice problems are at the end of each chapter. As you read, try to solve each problem on your own, and then check the solution to make sure you are on the right track. Each chapter is followed by a set of homework problems of varying difficulty. Your instructor has the solutions to the homework problems in an Instructor’s Manual. For each homework problem, we show a rating of the amount of effort we feel it will require:
◆ Should require just a few minutes. Little or no programming required.
◆◆ Might require up to 20 minutes. Often involves writing and testing some code.
Many of these are derived from problems we have given on exams.
◆◆◆ Requires a significant effort, perhaps 1–2 hours. Generally involves writing and testing a significant amount of code.
◆◆◆◆ A lab assignment, requiring up to 10 hours of effort.
Preface 9
code/intro/hello.c
1 #include [stdio.h]
2
3 int main()
4 {
5 printf("hello, world\n");
6 return 0;
7 }
code/intro/hello.c
Figure 1 A typical code example.
Each code example in the text was formatted directly, without any manual intervention, from a C program compiled with gcc and tested on a Linux system. Of course, your system may have a different version of gcc, or a different compiler altogether, and so your compiler might generate different machine code, but the overall behavior should be the same. All of the source code is available from the CS:APPWeb page at csapp.cs.cmu.edu. In the text, the file names of the source programs are documented in horizontal bars that surround the formatted code. For example, the program in Figure 1 can be found in the file hello.c in directory code/intro/. We encourage you to try running the example programs on your system as you encounter them.
To avoid having a book that is overwhelming, both in bulk and in content, we have created a number of Web asides containing material that supplements the main presentation of the book. These asides are referenced within the book with a notation of the form CHAP:TOP, where CHAP is a short encoding of the chapter subject, and TOP is short code for the topic that is covered. For example, Web Aside data:bool contains supplementary material on Boolean algebra for the presentation on data representations in Chapter 2, whileWeb Aside arch:vlog contains material describing processor designs using theVerilog hardware description language, supplementing the presentation of processor design in Chapter 4. All of theseWeb asides are available from the CS:APP Web page.
Aside What is an aside?
You will encounter asides of this form throughout the text. Asides are parenthetical remarks that give you some additional insight into the current topic. Asides serve a number of purposes. Some are little history lessons. For example, where did C, Linux, and the Internet come from? Other asides are meant to clarify ideas that students often find confusing. For example, what is the difference between a cache line, set, and block? Other asides give real-world examples. For example, how a floating-point error crashed a French rocket, or what the geometry of an actual Seagate disk drive looks like. Finally, some asides are just fun stuff. For example, what is a “hoinky”?
10 Preface
Book Overview
The CS:APP book consists of 12 chapters designed to capture the core ideas in computer systems:
. Chapter 1: A Tour of Computer Systems. This chapter introduces the major ideas and themes in computer systems by tracing the life cycle of a simple “hello, world” program.
. Chapter 2: Representing and Manipulating Information.We cover computer arithmetic, emphasizing the properties of unsigned and two’s-complement number representations that affect programmers.We consider how numbers are represented and therefore what range of values can be encoded for a given word size.We consider the effect of casting between signed and unsigned numbers. We cover the mathematical properties of arithmetic operations. Novice programmers are often surprised to learn that the (two’s-complement) sum or product of two positive numbers can be negative. On the other hand, two’scomplement arithmetic satisfies the algebraic properties of a ring, and hence a compiler can safely transform multiplication by a constant into a sequence of shifts and adds.We use the bit-level operations of C to demonstrate the principles and applications of Boolean algebra.We cover the IEEE floating-point format in terms of how it represents values and the mathematical properties of floating-point operations.
Having a solid understanding of computer arithmetic is critical to writing reliable programs. For example, programmers and compilers cannot replace the expression (x[y) with (x-y [ 0), due to the possibility of overflow. They cannot even replace it with the expression (-y [ -x), due to the asymmetric range of negative and positive numbers in the two’s-complement representation.
Arithmetic overflow is a common source of programming errors and security vulnerabilities, yet few other books cover the properties of computer arithmetic from a programmer’s perspective.
. Chapter 3: Machine-Level Representation of Programs.We teach you how to read the IA32 and x86-64 assembly language generated by a C compiler. We cover the basic instruction patterns generated for different control constructs, such as conditionals, loops, and switch statements. We cover the implementation of procedures, including stack allocation, register usage conventions, and parameter passing. We cover the way different data structures such as structures, unions, and arrays are allocated and accessed. We also use the machine-level view of programs as a way to understand common code security vulnerabilities, such as buffer overflow, and steps that the programmer, the compiler, and the operating system can take to mitigate these threats. Learning the concepts in this chapter helps you become a better programmer, because you will understand how programs are represented on a machine. One certain benefit is that you will develop a thorough and concrete understanding of pointers.
. Chapter 4: Processor Architecture. This chapter covers basic combinational and sequential logic elements, and then shows how these elements can be Preface 11
combined in a datapath that executes a simplified subset of the IA32 instruction set called “Y86.”We begin with the design of a single-cycle datapath. This design is conceptually very simple, but it would not be very fast.We then introduce pipelining, where the different steps required to process an instruction are implemented as separate stages. At any given time, each stage can work on a different instruction. Our five-stage processor pipeline is much more realistic.
The control logic for the processor designs is described using a simple hardware description language called HCL. Hardware designs written inHCL can be compiled and linked into simulators provided with the textbook, and they can be used to generate Verilog descriptions suitable for synthesis into working hardware.
. Chapter 5: Optimizing Program Performance.This chapter introduces a number of techniques for improving code performance, with the idea being that programmers learn to write their C code in such a way that a compiler can then generate efficient machine code. We start with transformations that reduce the work to be done by a program and hence should be standard practice when writing any program for any machine.We then progress to transformations that enhance the degree of instruction-level parallelism in the generated machine code, thereby improving their performance on modern “superscalar” processors. To motivate these transformations, we introduce a simple operational model of how modern out-of-order processors work, and show how to measure the potential performance of a program in terms of the critical paths through a graphical representation of a program. You will be surprised how much you can speed up a program by simple transformations of the C code.
. Chapter 6: The Memory Hierarchy.The memory system is one of the most visible parts of a computer system to application programmers. To this point, you have relied on a conceptual model of the memory system as a linear array with uniform access times. In practice, a memory system is a hierarchy of storage devices with different capacities, costs, and access times. We cover the different types ofRAMand ROMmemories and the geometry and organization of magnetic-disk and solid-state drives. We describe how these storage devices are arranged in a hierarchy. We show how this hierarchy is made possible by locality of reference. We make these ideas concrete by introducing a unique view of a memory system as a “memory mountain” with ridges of temporal locality and slopes of spatial locality. Finally, we show you how to improve the performance of application programs by improving their temporal and spatial locality.
. Chapter 7: Linking. This chapter covers both static and dynamic linking, including the ideas of relocatable and executable object files, symbol resolution, relocation, static libraries, shared object libraries, and position-independent code. Linking is not covered in most systems texts, but we cover it for several reasons. First, some of the most confusing errors that programmers can encounter are related to glitches during linking, especially for large software packages. Second, the object files produced by linkers are tied to concepts such as loading, virtual memory, and memory mapping.
12 Preface
. Chapter 8: Exceptional Control Flow. In this part of the presentation, we step beyond the single-program model by introducing the general concept of exceptional control flow (i.e., changes in control flow that are outside the normal branches and procedure calls). We cover examples of exceptional control flow that exist at all levels of the system, from low-level hardware exceptions and interrupts, to context switches between concurrent processes, to abrupt changes in control flow caused by the delivery of Unix signals, to the nonlocal jumps in C that break the stack discipline.
This is the part of the book where we introduce the fundamental idea of a process, an abstraction of an executing program. You will learn how processes work and how they can be created and manipulated from application programs. We show how application programmers can make use of multiple processes via Unix system calls.When you finish this chapter, you will be able to write a Unix shell with job control. It is also your first introduction to the nondeterministic behavior that arises with concurrent program execution.
. Chapter 9: Virtual Memory. Our presentation of the virtual memory system seeks to give some understanding of how it works and its characteristics. We want you to know how it is that the different simultaneous processes can each use an identical range of addresses, sharing some pages but having individual copies of others.We also cover issues involved in managing and manipulating virtual memory. In particular, we cover the operation of storage allocators such as the Unix malloc and free operations. Covering this material serves several purposes. It reinforces the concept that the virtual memory space is just an array of bytes that the program can subdivide into different storage units. It helps you understand the effects of programs containing memory referencing errors such as storage leaks and invalid pointer references. Finally, many application programmers write their own storage allocators optimized toward the needs and characteristics of the application. This chapter, more than any other, demonstrates the benefit of covering both the hardware and the software aspects of computer systems in a unified way. Traditional computer architecture and operating systems texts present only part of the virtual memory story.
. Chapter 10: System-Level I/O.We cover the basic concepts of Unix I/O such as files and descriptors.We describe how files are shared, how I/O redirection works, and how to access file metadata.We also develop a robust buffered I/O package that deals correctly with a curious behavior known as short counts, where the library function reads only part of the input data. We cover the C standard I/O library and its relationship to Unix I/O, focusing on limitations of standard I/O that make it unsuitable for network programming. In general, the topics covered in this chapter are building blocks for the next two chapters on network and concurrent programming.
. Chapter 11: Network Programming. Networks are interesting I/O devices to program, tying together many of the ideas that we have studied earlier in the text, such as processes, signals, byte ordering, memory mapping, and dynamic Preface 13
storage allocation. Network programs also provide a compelling context for concurrency, which is the topic of the next chapter. This chapter is a thin slice through network programming that gets you to the point where you can write a Web server. We cover the client-server model that underlies all network applications.We present a programmer’s view of the Internet, and show how to write Internet clients and servers using the sockets interface. Finally, we introduce HTTP and develop a simple iterativeWeb server.
. Chapter 12: Concurrent Programming. This chapter introduces concurrent programming using Internet server design as the running motivational example.
We compare and contrast the three basic mechanisms for writing concurrent programs—processes, I/O multiplexing, and threads—and show how to use them to build concurrent Internet servers.We cover basic principles of synchronization using P and V semaphore operations, thread safety and reentrancy, race conditions, and deadlocks. Writing concurrent code is essential for most server applications. We also describe the use of thread-level programming to express parallelism in an application program, enabling faster execution on multi-core processors. Getting all of the cores working on a single computational problem requires a careful coordination of the concurrent threads, both for correctness and to achieve high performance.
New to this Edition
The first edition of this book was published with a copyright of 2003. Considering the rapid evolution of computer technology, the book content has held up surprisingly well. Intel x86 machines running Unix-like operating systems and programmed in C proved to be a combination that continues to encompass many systems today. Changes in hardware technology and compilers and the experience of many instructors teaching the material have prompted a substantial revision. Here are some of the more significant changes:
. Chapter 2: Representing and Manipulating Information.We have tried to make this material more accessible, with more careful explanations of concepts and with many more practice and homework problems. We moved some of the more theoretical aspects to Web asides. We also describe some of the security vulnerabilities that arise due to the overflow properties of computer arithmetic.
. Chapter 3: Machine-Level Representation of Programs.We have extended our coverage to include x86-64, the extension of x86 processors to a 64-bit word size.We also use the code generated by a more recent version of gcc.We have enhanced our coverage of buffer overflow vulnerabilities. We have created Web asides on two different classes of instructions for floating point, and also a view of the more exotic transformations made when compilers attempt higher degrees of optimization. Another Web aside describes how to embed x86 assembly code within a C program.
14 Preface
. Chapter 4: Processor Architecture.We include a more careful exposition of exception detection and handling in our processor design. We have also created a Web aside showing a mapping of our processor designs into Verilog, enabling synthesis into working hardware.
. Chapter 5: Optimizing Program Performance.We have greatly changed our description of how an out-of-order processor operates, and we have created a simple technique for analyzing program performance based on the paths in a data-flow graph representation of a program. A Web aside describes how C programmers can write programs that make use of the SIMD (singleinstruction, multiple-data) instructions found in more recent versions of x86 processors.
. Chapter 6: The Memory Hierarchy. We have added material on solid-state disks, and we have updated our presentation to be based on the memory hierarchy of an Intel Core i7 processor.
. Chapter 7: Linking.This chapter has changed only slightly.
. Chapter 8: Exceptional Control Flow. We have enhanced our discussion of howthe process model introduces some fundamental concepts of concurrency, such as nondeterminism.
. Chapter 9:Virtual Memory.We have updated our memory system case study to describe the 64-bit Intel Core i7 processor.We have also updated our sample implementation of malloc to work for both 32-bit and 64-bit execution.
. Chapter 10: System-Level I/O.This chapter has changed only slightly.
. Chapter 11: Network Programming.This chapter has changed only slightly.
. Chapter 12: Concurrent Programming.We have increased our coverage of the general principles of concurrency, and we also describe how programmers can use thread-level parallelism to make programs run faster on multi-core machines.
In addition, we have added and revised a number of practice and homework problems.
Origins of the Book
The book stems from an introductory course that we developed at Carnegie Mellon University in the Fall of 1998, called 15-213: Introduction to Computer Systems (ICS) [14]. The ICS course has been taught every semester since then, each time to about 150–250 students, ranging from sophomores to masters degree students and with a wide variety of majors. It is a required course for all undergraduates in the CS and ECE departments at Carnegie Mellon, and it has become a prerequisite for most upper-level systems courses.
The idea with ICS was to introduce students to computers in a different way. Few of our students would have the opportunity to build a computer system. On the other hand, most students, including all computer scientists and computer engineers, will be required to use and program computers on a daily basis. So we Preface 15
decided to teach about systems from the point of view of the programmer, using the following filter: we would cover a topic only if it affected the performance, correctness, or utility of user-level C programs.
For example, topics such as hardware adder and bus designs were out. Topics such as machine language were in, but instead of focusing on how to write assembly language by hand, we would look at how a C compiler translates C constructs into machine code, including pointers, loops, procedure calls, and switch statements. Further, we would take a broader and more holistic view of the system as both hardware and systems software, covering such topics as linking, loading, processes, signals, performance optimization, virtual memory, I/O, and network and concurrent programming.
This approach allowed us to teach the ICS course in a way that is practical, concrete, hands-on, and exciting for the students. The response from our students and faculty colleagues was immediate and overwhelmingly positive, and we realized that others outside of CMU might benefit from using our approach. Hence this book, which we developed from the ICS lecture notes, and which we have now revised to reflect changes in technology and how computer systems are implemented.
For Instructors: Courses Based on the Book
Instructors can use the CS:APP book to teach five different kinds of systems courses (Figure 2). The particular course depends on curriculum requirements, personal taste, and the backgrounds and abilities of the students. From left to right in the figure, the courses are characterized by an increasing emphasis on the programmer’s perspective of a system. Here is a brief description: . ORG: A computer organization course with traditional topics covered in an untraditional style. Traditional topics such as logic design, processor architecture, assembly language, and memory systems are covered. However, there is more emphasis on the impact for the programmer. For example, data representations are related back to the data types and operations of C programs, and the presentation on assembly code is based on machine code generated by a C compiler rather than hand-written assembly code.
. ORG+: TheORGcourse with additional emphasis on the impact of hardware on the performance of application programs. Compared to ORG, students learn more about code optimization and about improving the memory performance of their C programs.
. ICS: The baseline ICS course, designed to produce enlightened programmers who understand the impact of the hardware, operating system, and compilation system on the performance and correctness of their application programs.
Asignificant difference fromORG+is that low-level processor architecture is not covered. Instead, programmers work with a higher-level model of a modern out-of-order processor. The ICS course fits nicely into a 10-week quarter, and can also be stretched to a 15-week semester if covered at a more leisurely pace.
16 Preface
Course
Chapter Topic ORG ORG+ ICS ICS+ SP
1 Tour of systems .. .. .
2 Data representation .. .. (d)
3 Machine language .. .. .
4 Processor architecture ..
5 Code optimization .. .
6 Memory hierarchy (a) ... (a)
7 Linking (c) (c) .
8 Exceptional control flow ...
9 Virtual memory (b) .. ..
10 System-level I/O ..
11 Network programming ..
12 Concurrent programming ..
Figure 2 Five systems courses based on the CS:APP book. Notes: (a) Hardware only, (b) No dynamic storage allocation, (c) No dynamic linking, (d) No floating point. ICS+ is the 15-213 course from Carnegie Mellon.
. ICS+: The baseline ICS course with additional coverage of systems programming topics such as system-level I/O, network programming, and concurrent programming. This is the semester-long Carnegie Mellon course, which covers every chapter in CS:APP except low-level processor architecture.
. SP: A systems programming course. Similar to the ICS+ course, but drops floating point and performance optimization, and places more emphasis on systems programming, including process control, dynamic linking, systemlevel I/O, network programming, and concurrent programming. Instructors might want to supplement from other sources for advanced topics such as daemons, terminal control, and Unix IPC.
The main message of Figure 2 is that the CS:APP book gives a lot of options to students and instructors. If you want your students to be exposed to lowerlevel processor architecture, then that option is available via theORGand ORG+ courses. On the other hand, if you want to switch from your current computer organization course to an ICS or ICS+ course, but are wary are making such a drastic change all at once, then you can move toward ICS incrementally. You can start with ORG, which teaches the traditional topics in a nontraditional way. Once you are comfortable with that material, then you can move to ORG+, and eventually to ICS. If students have no experience in C (for example they have only programmed in Java), you could spend several weeks on C and then cover the material of ORG or ICS.
Preface 17
Finally, we note that the ORG+ and SP courses would make a nice two-term (either quarters or semesters) sequence. Or you might consider offering ICS+ as one term of ICS and one term of SP.
Classroom-Tested Laboratory Exercises
The ICS+ course at Carnegie Mellon receives very high evaluations from students. Median scores of 5.0/5.0 and means of 4.6/5.0 are typical for the student course evaluations. Students cite the fun, exciting, and relevant laboratory exercises as the primary reason. The labs are available from the CS:APPWeb page. Here are examples of the labs that are provided with the book:
. Data Lab. This lab requires students to implement simple logical and arithmetic functions, but using a highly restricted subset of C. For example, they must compute the absolute value of a number using only bit-level operations.
This lab helps students understand the bit-level representations of C data types and the bit-level behavior of the operations on data.
. Binary Bomb Lab. A binary bomb is a program provided to students as an object-code file.When run, it prompts the user to type in six different strings.
If any of these is incorrect, the bomb “explodes,” printing an error message and logging the event on a grading server. Students must “defuse” their own unique bombs by disassembling and reverse engineering the programs to determine what the six strings should be. The lab teaches students to understand assembly language, and also forces them to learn how to use a debugger.
. Buffer Overflow Lab. Students are required to modify the run-time behavior of a binary executable by exploiting a buffer overflow vulnerability. This lab teaches the students about the stack discipline, and teaches them about the danger of writing code that is vulnerable to buffer overflow attacks.
. Architecture Lab. Several of the homework problems of Chapter 4 can be combined into a lab assignment, where students modify the HCL description of a processor to add new instructions, change the branch prediction policy, or add or remove bypassing paths and register ports. The resulting processors can be simulated and run through automated tests that will detect most of the possible bugs. This lab lets students experience the exciting parts of processor design without requiring a complete background in logic design and hardware description languages.
. Performance Lab. Students must optimize the performance of an application kernel function such as convolution or matrix transposition. This lab provides a very clear demonstration of the properties of cache memories, and gives students experience with low-level program optimization.
. Shell Lab.Students implement their own Unix shell program with job control, including the ctrl-c and ctrl-z keystrokes, fg, bg, and jobs commands.This is the student’s first introduction to concurrency, and gives them a clear idea of Unix process control, signals, and signal handling.
18 Preface
. Malloc Lab. Students implement their own versions of malloc, free, and (optionally) realloc. This lab gives students a clear understanding of data layout and organization, and requires them to evaluate different trade-offs between space and time efficiency.
. Proxy Lab. Students implement a concurrent Web proxy that sits between their browsers and the rest of the World Wide Web. This lab exposes the students to such topics as Web clients and servers, and ties together many of the concepts from the course, such as byte ordering, file I/O, process control, signals, signal handling, memory mapping, sockets, and concurrency. Students like being able to see their programs in action with realWeb browsers andWeb servers.
The CS:APP Instructor’s Manual has a detailed discussion of the labs, as well as directions for downloading the support software.
Acknowledgments for the Second Edition
We are deeply grateful to the many people who have helped us produce this second edition of the CS:APP text.
First and foremost, we would to recognize our colleagues who have taught the ICS course at Carnegie Mellon for their insightful feedback and encouragement: GuyBlelloch, Roger Dannenberg, David Eckhardt, Greg Ganger, Seth Goldstein, Greg Kesden, Bruce Maggs, Todd Mowry, Andreas Nowatzyk, Frank Pfenning, and Markus Pueschel.
Thanks also to our sharp-eyed readers who contributed reports to the errata page for the first edition: Daniel Amelang, Rui Baptista, Quarup Barreirinhas, Michael Bombyk, J ¨ org Brauer, Jordan Brough, Yixin Cao, James Caroll, Rui Carvalho, Hyoung-Kee Choi, Al Davis, Grant Davis, Christian Dufour, Mao Fan, Tim Freeman, Inge Frick, Max Gebhardt, Jeff Goldblat, Thomas Gross, Anita Gupta, John Hampton, Hiep Hong, Greg Israelsen, Ronald Jones, Haudy Kazemi, Brian Kell, Constantine Kousoulis, Sacha Krakowiak, Arun Krishnaswamy, Martin Kulas, Michael Li, Zeyang Li, Ricky Liu, Mario Lo Conte, Dirk Maas, Devon Macey, Carl Marcinik, Will Marrero, Simone Martins, Tao Men, Mark Morrissey, Venkata Naidu, Bhas Nalabothula, Thomas Niemann, Eric Peskin, David Po, Anne Rogers, John Ross, Michael Scott, Seiki, Ray Shih, Darren Shultz, Erik Silkensen, Suryanto, Emil Tarazi, Nawanan Theera-Ampornpunt, Joe Trdinich, Michael Trigoboff, James Troup, Martin Vopatek, Alan West, Betsy Wolff, Tim Wong, JamesWoodruff, Scott Wright, Jackie Xiao, Guanpeng Xu, Qing Xu, Caren Yang, Yin Yongsheng, Wang Yuanxuan, Steven Zhang, and Day Zhong. Special thanks to Inge Frick, who identified a subtle deep copy bug in our lock-and-copy example, and to Ricky Liu, for his amazing proofreading skills.
Our Intel Labs colleagues Andrew Chien and Limor Fix were exceptionally supportive throughout the writing of the text. Steve Schlosser graciously provided some disk drive characterizations. Casey Helfrich and Michael Ryan installed and maintained our new Core i7 box. Michael Kozuch, Babu Pillai, and Jason Campbell provided valuable insight on memory system performance, multi-core Preface 19
systems, and the power wall. Phil Gibbons and Shimin Chen shared their considerable expertise on solid-state disk designs.
We have been able to call on the talents of many, including Wen-Mei Hwu, Markus Pueschel, and Jiri Simsa, to provide both detailed comments and highlevel advice. James Hoe helped us create a Verilog version of the Y86 processor and did all of the work needed to synthesize working hardware.
Many thanks to our colleagues who provided reviews of the draft manuscript: James Archibald (Brigham Young University), Richard Carver (George Mason University), Mirela Damian (Villanova University), Peter Dinda (Northwestern University), John Fiore (Temple University), Jason Fritts (St. Louis University), John Greiner (Rice University), Brian Harvey (University of California, Berkeley), Don Heller (Penn State University), Wei Chung Hsu (University of Minnesota), Michelle Hugue (University of Maryland), Jeremy Johnson (Drexel University), Geoff Kuenning (Harvey Mudd College), Ricky Liu, Sam Madden (MIT), Fred Martin (University of Massachusetts, Lowell), Abraham Matta (Boston University), Markus Pueschel (Carnegie Mellon University), Norman Ramsey (Tufts University), Glenn Reinmann (UCLA), Michela Taufer (University of Delaware), and Craig Zilles (UIUC).
Paul Anagnostopoulos of Windfall Software did an outstanding job of typesetting the book and leading the production team. Many thanks to Paul and his superb team: Rick Camp (copyeditor), Joe Snowden (compositor), MaryEllen N. Oliver (proofreader), Laurel Muller (artist), and Ted Laux (indexer).
Finally, we would like to thank our friends at Prentice Hall. Marcia Horton has always been there for us. Our editor Matt Goldstein provided stellar leadership from beginning to end.We are profoundly grateful for their help, encouragement, and insights.
Acknowledgments from the First Edition
We are deeply indebted to many friends and colleagues for their thoughtful criticisms and encouragement. A special thanks to our 15-213 students, whose infectious energy and enthusiasm spurred us on. Nick Carter and Vinny Furia generously provided their malloc package.
Guy Blelloch, GregKesden, Bruce Maggs, and Todd Mowry taught the course over multiple semesters, gave us encouragement, and helped improve the course material. Herb Derby provided early spiritual guidance and encouragement. Allan Fisher, Garth Gibson, Thomas Gross, Satya, Peter Steenkiste, and Hui Zhang encouraged us to develop the course from the start. A suggestion from Garth early on got the whole ball rolling, and this was picked up and refined with the help of a group led by Allan Fisher. Mark Stehlik and Peter Lee have been very supportive about building this material into the undergraduate curriculum. Greg Kesden provided helpful feedback on the impact of ICS on the OS course. Greg Ganger and Jiri Schindler graciously provided some disk drive characterizations and answered our questions on modern disks. Tom Stricker showed us the memory mountain. James Hoe provided useful ideas and feedback on how to present processor architecture.
20 Preface
A special group of students—Khalil Amiri, Angela Demke Brown, Chris Colohan, Jason Crawford, Peter Dinda, Julio Lopez, Bruce Lowekamp, Jeff Pierce, Sanjay Rao, Balaji Sarpeshkar, Blake Scholl, Sanjit Seshia, Greg Steffan, Tiankai Tu, Kip Walker, and Yinglian Xie—were instrumental in helping us develop the content of the course. In particular, Chris Colohan established a fun (and funny) tone that persists to this day, and invented the legendary “binary bomb” that has proven to be a great tool for teaching machine code and debugging concepts.
Chris Bauer, Alan Cox, Peter Dinda, Sandhya Dwarkadas, John Greiner, Bruce Jacob, Barry Johnson, Don Heller, Bruce Lowekamp, Greg Morrisett, Brian Noble, Bobbie Othmer, Bill Pugh, Michael Scott, Mark Smotherman, Greg Steffan, and Bob Wier took time that they did not have to read and advise us on early drafts of the book. A very special thanks to Al Davis (University of Utah), Peter Dinda (Northwestern University), John Greiner (Rice University),Wei Hsu (University of Minnesota), Bruce Lowekamp (College of William & Mary), Bobbie Othmer (University of Minnesota), Michael Scott (University of Rochester), and Bob Wier (Rocky Mountain College) for class testing the Beta version. A special thanks to their students as well!
We would also like to thank our colleagues at Prentice Hall. Marcia Horton, Eric Frank, and Harold Stone have been unflagging in their support and vision.
Harold also helped us present an accurate historical perspective on RISC and CISC processor architectures. Jerry Ralya provided sharp insights and taught us a lot about good writing.
Finally, we would like to acknowledge the great technical writers Brian Kernighan and the late W. Richard Stevens, for showing us that technical books can be beautiful.
Thank you all.
Randy Bryant
Dave O’Hallaron
Pittsburgh, Pennsylvania
媒体评论回到顶部↑
“本书表述清晰、恰到好处——举重若轻地呈现了那些非常复杂的内容。” —— Ibrahim Matta, 波士顿大学
“这是一本学习计算机硬件和软件如何‘真正’协同工作的好书,还教会你为什么了解这些知识会使你成为一个更有价值的程序员。本书还帮你为学习像操作系统和编译器这样的高级课程做好准备。在本书中,我最喜欢的章节是关于缓存的,当我第一次发现缓存有多重要时,真是难以置信!” —— Vishal Shah,Ask.com总架构师








点击看大图








加载中...

