Poisson Distribution: A Formula for Calculating Probability Distributions
Published: 2019-05-11


Probability distributions play an important role in our daily lives. We commonly use them to summarise data and to gain insights from it.

Because of this, they are an important topic in fields such as Mathematics, Computer Science, Statistics, and Data Science.

There are two main types of data: numerical (for example integers and floats) and categorical (for example strings of text).

Numerical data can take either of two forms:

  • Discrete: this form of data can only take a countable set of values (like the number of clothes we have). From discrete data we can infer probability mass functions.

  • Continuous: continuous data, on the other hand, describes quantities such as weight or distance, which can take any real value within a range. From continuous data we can instead infer probability density functions.

Probability mass functions give us the probability that a variable is equal to a certain value. The values of probability density functions, on the other hand, are not probabilities on their own: the density first needs to be integrated over the range of interest to obtain a probability.
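As a rough illustration of this difference, the sketch below evaluates a probability mass function directly and integrates a probability density function numerically. The choice of distributions (a Poisson PMF and an exponential PDF, both with rate 3), the helper names, and the midpoint-rule integrator are assumptions made for this example only.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def exponential_pdf(x, rate):
    """Density of an exponential variable; a density, not a probability."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width

# A PMF value is already a probability.
p_exactly_two = poisson_pmf(2, 3.0)

# A PDF value can exceed 1, so it cannot be a probability by itself.
density_at_zero = exponential_pdf(0.0, 3.0)

# Integrating the density over a range gives a probability.
p_between = integrate(lambda x: exponential_pdf(x, 3.0), 0.0, 0.5)

print(f"P(X = 2)         = {p_exactly_two:.4f}")
print(f"density at x = 0 = {density_at_zero:.4f}")
print(f"P(0 <= T <= 0.5) = {p_between:.4f}")
```

The density at zero is larger than 1, which is only possible because it is not a probability; the integrated value always lies between 0 and 1.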

What is a Poisson Distribution?

Poisson Distributions are commonly used for two main purposes:

  • Predicting how many times an event will take place within a chosen time period. This technique can be used for different risk analysis applications such as house insurance price estimation.

  • Estimating the probability that an event will occur given how often it happened in the past (for example how likely it is that there will be a power cut in the next two months).
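The second use case can be sketched in a few lines of Python. For a Poisson-distributed count with expected value λ, the probability of seeing no events is e^(-λ), so the probability of at least one event is 1 - e^(-λ). The historical figures below (6 power cuts in 36 months) are invented for illustration, as is the function name.

```python
import math

def prob_at_least_one(rate_per_month, months):
    """Probability of one or more events in the window, assuming a Poisson model."""
    expected_events = rate_per_month * months  # this is λ for the window
    return 1.0 - math.exp(-expected_events)

# Hypothetical history: 6 power cuts observed over the past 36 months.
observed_cuts = 6
observed_months = 36
rate = observed_cuts / observed_months  # average power cuts per month

p_two_months = prob_at_least_one(rate, 2)
print(f"P(at least one power cut in 2 months) = {p_two_months:.3f}")
```

Note that the estimate is only as good as the assumption that the rate stays the same from the observed period into the forecast period.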

Poisson Distributions let us be confident about the average time between occurrences of an event. They can't, however, tell us the precise moment an event will take place (since the underlying processes usually behave stochastically).

Linear vs non-linear systems

Natural systems can, in fact, be divided into two main categories: linear and non-linear (stochastic).

In linear systems, causes always precede their effects, which creates a strong time-precedence relationship.

This does not hold for non-linear systems, where small changes in the system's initial conditions can lead to unpredictable outcomes.

Considering how complex and chaotic our real world is, most processes are better described using non-linear systems, although linear approximations are sometimes possible.

Poisson Distributions can be modeled using the expression P(X = k) = λ^k e^(-λ) / k!, where k is the number of events observed and λ represents the expected number of events in the considered time-span.
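To make the Poisson probability mass function P(X = k) = λ^k e^(-λ) / k! concrete, here is a minimal sketch that evaluates it directly with Python's standard library. The function name `poisson_pmf` and the chosen value λ = 4 are assumptions for illustration; in practice a library routine such as `scipy.stats.poisson.pmf` (used later in this article) does the same job.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when lam events are expected."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)

lam = 4  # assumed expected number of events in the time-span
probabilities = [poisson_pmf(k, lam) for k in range(10)]

for k, p in enumerate(probabilities):
    print(f"P(X = {k}) = {p:.4f}")

# The probabilities over all k >= 0 sum to 1; the first 10 terms cover most of it.
print(f"sum over k = 0..9: {sum(probabilities):.4f}")
```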

The main characteristics which describe Poisson Processes are:

  1. Two events can't take place simultaneously.

  2. The average rate at which events occur is constant.

  3. Events are independent of each other (if one happens, this does not have any influence on the probability that another event might take place).

  4. Events can take place any number of times (within the considered time-span).
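These properties can be checked empirically with a small simulation. The sketch below assumes the standard construction of a Poisson process, in which waiting times between events are drawn independently from an exponential distribution with a constant rate. The rate, window length, seed, and function name are arbitrary choices for the example.

```python
import random
import statistics

def count_events(rate, window, rng):
    """Count events in [0, window) when gaps between events are exponential."""
    elapsed = rng.expovariate(rate)  # time of the first event
    count = 0
    while elapsed < window:
        count += 1
        elapsed += rng.expovariate(rate)  # independent gap to the next event
    return count

rng = random.Random(42)   # fixed seed so the run is repeatable
rate = 5.0                # assumed average number of events per unit of time
window = 1.0              # length of each observation window
counts = [count_events(rate, window, rng) for _ in range(20_000)]

mean_count = statistics.fmean(counts)
var_count = statistics.pvariance(counts)

# For a Poisson distribution both values should be close to rate * window.
print(f"mean = {mean_count:.2f}, variance = {var_count:.2f}")
```

A mean and variance that are both close to the expected number of events is a quick sanity check that counts are plausibly Poisson distributed.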

An example of a Poisson Distribution

In the figure below, you can see how varying the expected number of events (λ) in a period changes a Poisson Distribution. The figure was generated with this Python code:

```python
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats

# n = number of events, lambd = expected number of events
# which can take place in a period
n = np.arange(0, 9)
for lambd in range(2, 12, 2):
    poisson = stats.poisson.pmf(n, lambd)
    plt.plot(n, poisson, '-o', label="λ = {:d}".format(lambd))

# Label and save the figure once, after all curves have been drawn
plt.xlabel('Number of Events', fontsize=12)
plt.ylabel('Probability', fontsize=12)
plt.title("Poisson Distribution varying λ")
plt.legend()
plt.savefig('name.png')
```

Taking a closer look at this simulation, we can discover the following patterns:

  • In each case, the peak of the distribution sits at or just below λ (for an integer λ, the values λ - 1 and λ are equally likely), and the probabilities trail off on either side of the peak.

  • The more events that are expected to take place, the wider and flatter the distribution becomes, since its variance is also equal to λ. The probabilities always sum to 1, so the total area under the distribution does not change.
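The location of the peak, the spread, and the total probability of a Poisson Distribution can be checked numerically using only the standard library. The helper `poisson_pmf` and the cut-off of 100 terms are assumptions of this sketch, not part of the original simulation.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events when lam events are expected."""
    return (lam ** k) * math.exp(-lam) / math.factorial(k)

summary = {}
for lam in range(2, 12, 2):
    # 100 terms are plenty here: the tail beyond k = 99 is negligible for lam <= 10.
    pmf = [poisson_pmf(k, lam) for k in range(100)]
    total = sum(pmf)
    mean = sum(k * p for k, p in enumerate(pmf))
    variance = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    peak = max(pmf)
    # All k whose probability ties for the maximum (up to rounding error).
    modes = [k for k, p in enumerate(pmf) if math.isclose(p, peak, rel_tol=1e-9)]
    summary[lam] = (total, variance, peak, modes)
    print(f"λ = {lam:2d}: total = {total:.4f}, variance = {variance:.4f}, "
          f"peak height = {peak:.4f}, modes = {modes}")
```

The total stays at 1 for every λ, while the peak height shrinks and the variance grows as λ increases.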

This type of simulation could, for example, be used to try to reduce queuing times at a supermarket.

The owner could keep a record of how many customers visit the store at different times and on different days of the week, and then fit this data to a Poisson Distribution.

In this way, it would be much easier to determine how many cashiers should be working at different times of the day/week in order to improve the customer experience.
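As a sketch of how such a fit might look: for a Poisson model, the maximum-likelihood estimate of λ is simply the sample mean of the observed counts. Everything specific below is hypothetical, including the customer counts, the assumption that one cashier can serve 10 customers per hour, and the 95% service target.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for a Poisson variable with mean lam."""
    return sum(lam ** i * math.exp(-lam) / math.factorial(i) for i in range(k + 1))

def smallest_count_covering(lam, target):
    """Smallest k such that P(X <= k) >= target."""
    k = 0
    while poisson_cdf(k, lam) < target:
        k += 1
    return k

# Hypothetical record: customers arriving between 17:00 and 18:00 on ten Fridays.
friday_counts = [31, 27, 35, 29, 33, 30, 26, 34, 28, 32]

# Maximum-likelihood estimate of λ for a Poisson model is the sample mean.
lam_hat = sum(friday_counts) / len(friday_counts)

# Plan for a customer count that is exceeded in at most 5% of such hours.
busy_hour_customers = smallest_count_covering(lam_hat, 0.95)

customers_per_cashier = 10  # assumed hourly capacity of one cashier
cashiers_needed = math.ceil(busy_hour_customers / customers_per_cashier)

print(f"estimated λ = {lam_hat:.1f}")
print(f"at least 95% of such hours should see at most {busy_hour_customers} customers")
print(f"cashiers needed: {cashiers_needed}")
```

Repeating the fit for each hour and weekday would give a separate λ, and so a separate staffing level, for each time slot.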

Wrapping up

In case you are interested in learning more about the applications of distributions in stochastic settings, more information is available.

I hope you enjoyed this article. Thank you for reading!


Translated from:

Repost address: http://tmgwd.baihongyu.com/
