18 Tradeoffs of Replacing Core Components

There's a lot of hype around swapping out core parts of Django's stack for other pieces. Should you do it?

Short Answer: Don't do it. These days, even Instagram says on Forbes.com that it's completely unnecessary: http://2scoops.co/instagram-insights.

Long Answer: It's certainly possible, since Django modules are just Python modules. Is it worth it? Well, it's worth it only if:

  • You are okay with sacrificing some or all of your ability to use third-party Django packages.
  • You have no problem giving up the powerful Django admin.
  • You have already made a determined effort to build your project with core Django components, but you are running into walls that are major blockers.
  • You have already analyzed your own code to find and fix the root causes of your problems. For example, you've done all the work you can to reduce the number of queries made in your templates.
  • You've explored all other options including caching, denormalization, etc.
  • Your project is a real, live production site with tons of users. In other words, you're certain that you're not just optimizing prematurely.
  • You've looked at and rejected adopting a Service Oriented Approach (SOA) for those cases Django has problems dealing with.
  • You're willing to accept the fact that upgrading Django will be extremely painful or impossible going forward.

That doesn't sound so great anymore, does it?

18.1 The Temptation to Build FrankenDjango

Every year, a new fad leads waves of developers to replace some particular core Django component. Here's a summary of some of the fads we've seen come and go.

Fad: For performance reasons, replacing the database/ORM with a NoSQL database and corresponding ORM replacement.
Reasons:
  • Not okay: "I have an idea for a social network for ice cream haters. I just started building it last month. I need it to be web-scale!!!!!"
  • Okay: "Our site has 50M users and I'm hitting the limits of what I can do with indexes, query optimization, caching, etc. We're also pushing the limits of our Postgres cluster. I've done a lot of research on this and am going to try storing a simple denormalized view of data in Cassandra to see if it helps. I'm aware of the CAP theorem (http://www.2scoops.co/CAP-theorem/), and for this view, eventual consistency is fine."

Fad: For data processing reasons, replacing the database/ORM with a NoSQL database and corresponding ORM replacement.
Reasons:
  • Not okay: "SQL Sucks! We're going with a document-oriented database like MongoDB!"
  • Okay: "While PostgreSQL's HSTORE datatype replicates nearly every aspect of MongoDB's data storage system, we want to use MongoDB's built-in MapReduce functionality."

Fad: Replacing Django's template engine with Jinja2, Mako, or something else.
Reasons:
  • Not okay: "I read on Hacker News that Jinja2 is faster. I don't know anything about caching or optimization, but I need Jinja2!"
  • Not okay: "I hate having logic in Python modules. I want logic in my templates!"
  • Okay: "I have a small number of views which generate 1MB+ HTML pages designed for Google to index. I'll use Django's native support for multiple template languages to render the 1MB+ sized pages with Jinja2, and serve the rest with Django Template Language."

Table 18.1: Fad-based Reasons to Replace Components of Django

Figure 18.1: Replacing more core components of cake with ice cream seems like a good idea. Which cake would win? The one on the right, of course.

18.2 Non-Relational Databases vs Relational Databases

Even Django projects that use relational databases for persistent data storage rely on non-relational databases. If a project relies on tools like Memcached for caching and Redis for queuing, then it’s using non-relational databases.

The problem occurs when NoSQL solutions are used to completely replace Django’s relational database functionality without considering in-depth the long-term implications.

18.2.1 Not All Non-Relational Databases Are ACID Compliant

ACID is an acronym for:

Atomicity means that all parts of a transaction work or it all fails. Without this, you risk data corruption.

Consistency means that any transaction will keep data in a valid state. Strings remain strings and integers remain integers. Without this, you risk data corruption.

Isolation means that concurrently executing transactions will not collide with or leak into one another. Without this, you risk data corruption.

Durability means that once a transaction is committed, it will remain so even if the database server is shut down. Without this, you risk data corruption.

Did you notice how each of those descriptions ended with ‘Without this, you risk data corruption’? This is because most NoSQL engines have little to no mechanism for ACID compliance. It’s much easier to corrupt the data, which is mostly a non-issue for things like caching but another thing altogether for projects that process persistent medical or e-commerce data.
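
To make atomicity concrete, here is a minimal sketch using SQLite from the Python standard library. The accounts table, the names in it, and the transfer() helper are all assumptions made up for illustration. The transfer credits one account and then debits another inside a single transaction, so when the debit violates a constraint the earlier credit is rolled back as well.

```python
import sqlite3

# Hypothetical schema for illustration: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Credit the receiver, then debit the sender, as one transaction."""
    try:
        # "with conn" commits if the block succeeds and rolls back
        # if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was undone too.
        return False
    return True


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


overdraft = transfer(conn, "alice", "bob", 500)  # rejected as a whole
payment = transfer(conn, "alice", "bob", 40)     # applied as a whole
print(overdraft, payment, balances(conn))
```

A datastore without transactions would leave the credit in place after the failed debit unless our own code noticed the failure and reversed it by hand.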

18.2.2 Don't Use Non-Relational Databases for Relational Tasks

Imagine if we were to use a non-relational database to track the sale of properties, property owners, and how property laws worked for them in 50 US states. There are a lot of unpredictable details, so wouldn’t a schemaless datastore be perfect for this task?

Perhaps...

We would need to track the relationship between properties, property owners, and laws of 50 states. Our Python code would have to maintain the referential integrity between all the components. We would also need to ensure that the right data goes into the right place.

For a task like this, stick with a relational database.
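
As a rough illustration of what the database does for us here, the sketch below uses SQLite from the Python standard library. The tables, columns, and sample rows are hypothetical and far simpler than a real property system. With foreign keys declared, the database itself refuses a property that points at a missing owner or an unknown state, so that check does not have to live in our Python code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE states (code TEXT PRIMARY KEY);
    CREATE TABLE owners (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE properties (
        id INTEGER PRIMARY KEY,
        address TEXT NOT NULL,
        state_code TEXT NOT NULL REFERENCES states(code),
        owner_id INTEGER NOT NULL REFERENCES owners(id)
    );
    INSERT INTO states VALUES ('CA'), ('NY');
    INSERT INTO owners VALUES (1, 'Pat');
    """
)


def add_property(conn, address, state_code, owner_id):
    """Insert a property; return False if the database rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO properties (address, state_code, owner_id) "
                "VALUES (?, ?, ?)",
                (address, state_code, owner_id),
            )
    except sqlite3.IntegrityError:
        return False
    return True


valid = add_property(conn, "1 Scoop St", "CA", 1)
orphan = add_property(conn, "2 Cone Ave", "CA", 99)    # no owner with id 99
bad_state = add_property(conn, "3 Sundae Rd", "ZZ", 1)  # no state 'ZZ'
stored = conn.execute("SELECT COUNT(*) FROM properties").fetchone()[0]
print(valid, orphan, bad_state, stored)
```

In a schemaless store, both rejected rows would be accepted, and every piece of code that writes properties would have to repeat these checks itself.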

18.2.3 Ignore the Hype and Do Your Own Research

It’s often said that non-relational databases are faster and scale better than relational databases. Whether or not this is true, don’t blindly swallow the marketing hype of the companies behind any particular alternative database solution.

Instead, do as we do: search for benchmarks, read case studies describing when things went right or wrong, and form opinions as independently as possible.

Also, experiment with unfamiliar NoSQL databases on small hobby side projects before you make major changes to your main project infrastructure. Your main codebase is not a playground.

Lessons learned by companies and individuals:

18.2.4 How We Use Non-Relational Databases With Django

This is how we prefer to do things:

  • If we use a non-relational data store, we limit its use to short-term things like caches, queues, and sometimes denormalized data. But we avoid it if possible, to reduce the number of moving parts.
  • We use relational data stores for long-term, relational data and sometimes denormalized data (PostgreSQL’s array and HStore fields work great for this task).

For us, this is the sweet spot that makes our Django projects shine.
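
One way to picture this split is the cache-aside pattern sketched below. Everything in it is an assumption for illustration: TTLCache is a tiny in-process stand-in for a real cache such as Memcached or Redis, and the flavors table is invented. The relational database stays the system of record, and the cache only holds short-lived copies that can be thrown away at any time.

```python
import sqlite3
import time

# Relational store: the long-term source of truth (hypothetical table).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE flavors (slug TEXT PRIMARY KEY, title TEXT NOT NULL)")
db.execute("INSERT INTO flavors VALUES ('vanilla', 'Vanilla')")
db.commit()


class TTLCache:
    """In-process stand-in for a cache server; entries expire after ttl seconds."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: behave as a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, time.monotonic() + self.ttl)

    def delete(self, key):
        self._data.pop(key, None)


cache = TTLCache(ttl=60)
db_reads = 0  # counts how often we fall through to the database


def get_title(slug):
    """Read through the cache, falling back to the database on a miss."""
    global db_reads
    key = f"flavor:{slug}"
    title = cache.get(key)
    if title is None:
        db_reads += 1
        row = db.execute(
            "SELECT title FROM flavors WHERE slug = ?", (slug,)
        ).fetchone()
        title = row[0] if row else None
        if title is not None:
            cache.set(key, title)
    return title


def rename(slug, title):
    """Write to the database first, then drop the stale cache entry."""
    with db:
        db.execute("UPDATE flavors SET title = ? WHERE slug = ?", (title, slug))
    cache.delete(f"flavor:{slug}")


first = get_title("vanilla")   # miss: reads the database
second = get_title("vanilla")  # hit: served from the cache
rename("vanilla", "Classic Vanilla")
third = get_title("vanilla")   # miss again after invalidation
print(first, second, third, db_reads)
```

If the cache is lost entirely, nothing of value is lost with it, which is why weaker guarantees are acceptable there.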

18.3 What About Replacing the Django Template Language?

We advocate the practice of sticking entirely to the Django Template Language (DTL) with the exception of rendered content of huge size. However, as this use case is now covered by Django’s native support of alternate template systems, we’ve moved discussion of this topic to chapter 15, Using Alternate Template Systems.
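
For orientation, a settings fragment along these lines is roughly what running both engines side by side looks like. Treat it as a sketch: the directory names are assumptions, and a real project will usually need more under OPTIONS (context processors, a Jinja2 environment, and so on).

```python
# settings.py fragment (sketch): two template engines configured side by side.
# The DIRS values are hypothetical; adjust them to the project layout.
TEMPLATES = [
    {
        # Django Template Language, used for almost everything.
        "BACKEND": "django.template.backends.django.DjangoTemplates",
        "DIRS": ["templates"],
        "APP_DIRS": True,  # looks in each app's "templates" directory
        "OPTIONS": {},
    },
    {
        # Jinja2, reserved for the few very large pages.
        "BACKEND": "django.template.backends.jinja2.Jinja2",
        "DIRS": ["jinja2"],
        "APP_DIRS": True,  # looks in each app's "jinja2" directory
        "OPTIONS": {},
    },
]

# A view can then pick the engine by name, for example:
#   render(request, "huge_page.html", context, using="jinja2")
# where "huge_page.html" is a hypothetical template name.
```

Keeping DTL first in the list means it stays the default, and only views that explicitly ask for Jinja2 are affected.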

18.4 Summary

Always use the right tool for the right job. We prefer to go with stock Django components, just like we prefer using a scoop when serving ice cream. However, there are times when other tools make sense.

Just don’t follow the fad of mixing vegetables into your ice cream. You simply can’t replace the classic strawberry, chocolate, and vanilla with supposedly "high-performance" flavors such as broccoli, corn, and spinach. That’s taking it too far.
