Web Operations (English Reprint Edition)
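Basic Information
- Authors: John Allspaw, Jesse Robbins
- Series: Southeast University Press (Nanjing) O'Reilly Series
- Publisher: Southeast University Press
- ISBN: 9787564125028
- Date listed: 2011-3-31
- Publication date: January 2011
- Format: 16mo
- Pages: 315
- Edition: 1st edition, 1st printing
- Category:
Computers > Computer Networks > Web Server > General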
Synopsis
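Web applications involve many specialists, but it is the web operations staff who must ensure that every part of an application works properly throughout its lifecycle. This is the expertise you need when a start-up is hit by an unexpected spike in traffic, or when a new feature causes a mature application to fail. In this collection of essays and interviews, web operations veterans Theo Schlossnagle, Baron Schwartz, and Alistair Croll offer their insights into this rapidly evolving field. You will also learn what it takes to make a website thrive, firsthand from the builders of some of the largest sites on the Web.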
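· Learn the skills needed in web operations, and why they are gained through experience rather than schooling
· Understand why it is important to gather metrics from both your application and your infrastructure
· Consider common approaches to database architectures and the pitfalls that come with increasing scale
· Learn how to handle the human side of outages and degradations
· Find ways to avoid disaster after a huge deluge of traffic
· Discover what went wrong after a problem occurs, and how to prevent it from happening again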
Table of Contents
Web Operations (English Reprint Edition)
foreword
preface
1 web operations: the career
theo schlossnagle
why does web operations have it tough?
from apprentice to master
conclusion
2 how picnik uses cloud computing: lessons learned
justin huff
where the cloud fits (and why!)
where the cloud doesn't fit (for picnik)
conclusion
3 infrastructure and application metrics
john allspaw, with matt massie
time resolution and retention concerns
locality of metrics collection and storage
layers of metrics
providing context for anomaly detection and alerts
log lines are metrics, too
Preface
DESIGNING, BUILDING, AND MAINTAINING A GROWING WEBSITE has unique challenges when it comes to the fields of systems administration and software development. For one, the Web never sleeps. Because websites are globally used, there is no "good" time for changes, upgrades, or maintenance windows, only fewer "bad" times. This also means that outages are guaranteed to affect someone, somewhere using the site, no matter what time it is.
As web applications become an increasing part of our daily lives, they are also becoming more complex. With that complexity comes more parts to build and maintain and, unfortunately, more parts to fail. On top of that, there are requirements for being fast, secure, and always available across the planet. All these things add up to what's become a specialized field of engineering: web operations.
This book was conceived to gather insights into this still-evolving field from web veterans around the industry. Jesse Robbins and I came up with a list of tip-of-iceberg topics and asked these experts for their hard-earned advice and stories from the trenches.
How This Book Is Organized
The chapters in this book are organized as follows:
Chapter 1, Web Operations: The Career by Theo Schlossnagle, describes what this field actually encompasses and underscores that the skills needed are gained more through experience than through formal education.
Chapter 2, How Picnik Uses Cloud Computing: Lessons Learned by Justin Huff, explains how Picnik.com went about deploying and sustaining its infrastructure on a mix of on-premise hardware and cloud services.
Chapter 3, Infrastructure and Application Metrics by Matt Massie and myself, discusses the importance of gathering metrics from both your application and your infrastructure, and considerations on how to gather them.
Chapter 4, Continuous Deployment by Eric Ries, gives his take on the advantages of deploying code to production in small batches, frequently.
Chapter 5, Infrastructure as Code by Adam Jacob, gives an overview of the theory and approaches for configuration and deployment management.
Chapter 6, Monitoring by Patrick Debois, discusses the various considerations when designing a monitoring system.
Chapter 7, How Complex Systems Fail, is Dr. Richard Cook's whitepaper on systems failure and the nature of complexity that is often found in web architectures. He also adds some web operations-specific notes to his original paper.
Chapter 8, Community Management and Web Operations, is my interview with Heather Champ on the topic of how outages and degradations should be handled on the human side of things.
Chapter 9, Dealing with Unexpected Traffic Spikes by Brian Moon, talks about the experiences with huge traffic deluges at Dealnews.com and what they did to mitigate disaster.
Chapter 10, Dev and Ops Collaboration and Cooperation by Paul Hammond, lists some of the places where development and operations can come together to enable the business, both technically and culturally.
Chapter 11, How Your Visitors Feel: User-Facing Metrics by Alistair Croll and Sean Power, discusses metrics that can be used to illustrate what the real experience of your site is.
Chapter 12, Relational Database Strategy and Tactics for the Web by Baron Schwartz, lays out common approaches to database architectures and some pitfalls that come with increasing scale.
Chapter 13, How to Make Failure Beautiful: The Art and Science of Postmortems by Jake Loomis, goes into what makes or breaks a good postmortem and root cause analysis process.
Chapter 14, Storage by Anoop Nagwani, explores the gamut of approaches and considerations when designing and maintaining storage for a growing web application.
Chapter 15, Nonrelational Databases by Eric Florenzano, lists considerations and advantages of using a growing number of "nonrelational" database technologies.
Foreword
IT'S BEEN OVER A DECADE SINCE THE FIRST WEBSITES REACHED REAL SCALE.
We were there then, in those early days, watching our sites growing faster than anyone had seen before or knew how to manage. It was up to us to figure out how to keep everything running, to make things happen, to get things done.
While everyone else was at the launch party, we were deep in the bowels of the datacenter racking and stacking the last servers. Then we sat at our desks late into the night, our faces lit with the glow of logfiles and graphs streaming by.
Our experiences were universal: Our software crashed or couldn't scale. The databases crashed and data was corrupted, while every server, disk, and switch failed in ways the manufacturer absolutely, positively said it wouldn't. Hackers attacked--first for fun and then for profit. And just when we got things working again, a new feature would be pushed out, traffic would spike, and everything would break all over again.
In the early days, we used what we could find because we had no budget. Then we grew from mismatched, scavenged machines hidden in closets to megawatt-scale datacenters spanning the globe filled with the cheapest machines we could find.
As we got to scale, we had to deal with the real world and its many dangers. Our datacenters caught fire, flooded, or were ripped apart by hurricanes. Our power failed. Generators didn't kick in--or started and then ran out of fuel--or were taken down when someone hit the Emergency Power Off. Cooling failed. Sprinklers leaked. Fiber was cut by backhoes and squirrels and strange creatures crawling along the seafloor.
Man, machine, and Mother Nature challenged us in every way imaginable and then surprised us in ways we never expected.
We worked from the instant our pagers woke us up or when a friend innocently inquired, "Is the site down?" or when the CEO called scared and furious. We were always the first ones to know it was down and the last to leave when it was back up again.
Always.
Every day we got a little smarter, a little wiser, and learned a few more tricks. The scripts we wrote a decade ago have matured into tools and languages of their own, and whole industries have emerged around what we do. The knowledge, experiences, tools, and processes are growing into an art we call Web Operations.
We say that Web Operations is an art, not a science, for a reason. There are no standards, certifications, or formal schooling (at least not yet). What we do takes a long time to learn and longer to master, and everyone at every skill level must find his or her own style. There's no "right way," only what works (for now) and a commitment to doing it even better next time.
The Web is changing the way we live and touches every person alive. As more and more people depend on the Web, they depend on us.
Web Operations is work that matters.
--Jesse Robbins
The contributors to this book have donated their payments to the 826 Foundation, which helps kids learn to love reading at places like the Superhero Supply Company, the Greenwood Space Travel Supply Company, and the Liberty Street Robot Supply & Repair Shop.








