<p><strong>OpenBlog</strong>: I blog about my experience with the open source community. By Gagandeep Singh (feed: https://czgdp1807.github.io/feed.xml, updated 2019-06-23).</p>
<p><strong>Week 4 - Phase 1 Ends</strong> (2019-06-23, https://czgdp1807.github.io/week_4)</p>
<p>With the fourth week, the first phase of my journey with <code class="highlighter-rouge">SymPy</code> has come to an end. This blog post summarises the work done so far in terms of PRs, and then shares my plans for phase 2.</p>
<p>I worked on the following PRs (listed in chronological order) during the first phase. Many of them were merged and a few are still open.</p>
<p><strong>Merged</strong></p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16528">#16528</a> : I worked on extending the <code class="highlighter-rouge">GumbelDistribution</code> to support both minimum and maximum versions of it.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16576">#16576</a> : This PR added the <code class="highlighter-rouge">Dirichlet</code> and <code class="highlighter-rouge">MultivariateEwens</code> distributions.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16808">#16808</a> : This PR added the <code class="highlighter-rouge">Multinomial</code> and <code class="highlighter-rouge">NegativeMultinomial</code> distributions.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16810">#16810</a> : This PR improved the API of <code class="highlighter-rouge">Sum</code> by allowing <code class="highlighter-rouge">Range</code> as the limits.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16825">#16825</a> : Continuing the work on multivariate distributions, this PR added the <code class="highlighter-rouge">GeneralizedMultivariateLogGamma</code> distribution. This was an interesting one due to the complexity of its PDF.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16834">#16834</a> : This PR enhanced the <code class="highlighter-rouge">Multinomial</code> and <code class="highlighter-rouge">NegativeMultinomial</code> distributions by allowing symbolic dimensions for them.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16897">#16897</a> : This was related to <code class="highlighter-rouge">sympy.core</code> and helped remove a disparity in the results of the special function <code class="highlighter-rouge">gamma</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16899">#16899</a> : This was a workflow-related PR to ignore the <code class="highlighter-rouge">.vscode</code> folder.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16908">#16908</a> : This PR improved <code class="highlighter-rouge">sympy.stats.frv</code> by allowing conditions with foreign symbols.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16913">#16913</a> : This removed the unreachable code from <code class="highlighter-rouge">sympy.stats.frv</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16914">#16914</a> : This PR allowed symbolic dimensions in the <code class="highlighter-rouge">MultivariateEwens</code> distribution.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16929">#16929</a> : This one was for the <code class="highlighter-rouge">sympy.tensor</code> module. It optimized the <code class="highlighter-rouge">ArrayComprehension</code> and covered some corner cases.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16981">#16981</a> : This PR added the architecture for stochastic processes. It also added discrete Markov chains to <code class="highlighter-rouge">sympy.stats</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17003">#17003</a> : This PR ignored the <code class="highlighter-rouge">__pycache__</code> folder by adding it to the <code class="highlighter-rouge">.gitignore</code> file.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17030">#17030</a> : Some features, like <code class="highlighter-rouge">joint_distribution</code>, were added in this PR.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17046">#17046</a> : Some common properties of discrete Markov chains, like the fundamental matrix and the fixed row vector, were added.</p>
</li>
</ul>
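<p>To make the "fixed row vector" from the last item concrete, here is a minimal sketch in plain Python. It is not the code from the PR and does not use the SymPy API; the helper names and the 2-state transition matrix are made up for illustration. It assumes the chain has a unique fixed row vector <code class="highlighter-rouge">w</code> satisfying <code class="highlighter-rouge">w P = w</code> with entries summing to 1, and finds it by exact Gauss-Jordan elimination.</p>

```python
from fractions import Fraction


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination over exact fractions.

    Assumes the system has a unique solution (hypothetical helper).
    """
    n = len(a)
    # Build the augmented matrix [a | b].
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot becomes 1.
        m[col] = [v / m[col][col] for v in m[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[n] for row in m]


def fixed_row_vector(p):
    """Return w with w P = w and sum(w) = 1 for a transition matrix p."""
    n = len(p)
    # Rows of (P^T - I) w = 0; the last equation is replaced by sum(w) = 1.
    a = [[p[j][i] - (1 if i == j else 0) for j in range(n)] for i in range(n)]
    a[-1] = [Fraction(1)] * n
    b = [Fraction(0)] * (n - 1) + [Fraction(1)]
    return solve(a, b)


# Made-up 2-state transition matrix (each row sums to 1).
P = [
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(3, 4)],
]
w = fixed_row_vector(P)
print(w)  # [Fraction(1, 3), Fraction(2, 3)]
```

<p>Using exact fractions instead of floats keeps the result in the same spirit as a symbolic library: the answer is 1/3 and 2/3 rather than a rounded decimal.</p>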
<p><strong>Open</strong></p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16962">#16962</a> : The aim of this PR is to allow symbolic dimensions in single finite distributions, like <code class="highlighter-rouge">Die</code> and <code class="highlighter-rouge">Binomial</code>. The work from my side is complete on this.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16934">#16934</a> : This PR aims to fix the bugs and test the code introduced in <code class="highlighter-rouge">GSoC</code> 2018.</p>
</li>
</ul>
<p>Apart from the above PRs, I also reviewed code written by other contributors.
Overall, phase 1 was a great learning and working experience for me.</p>
<p>Let me share with you my plan for phase 2. Since I will be working on random matrices during the upcoming phase, I have started the design discussions with my mentors and things are taking shape. Apart from random matrices, I will also work on a few more general improvements to my phase 1 work.</p>
<p>The outline is given below:</p>
<ol>
<li>
<p>I will work on extending the scope of queries handled by discrete Markov chains by covering some uncommon cases.</p>
</li>
<li>
<p>Some work will be done to extend <code class="highlighter-rouge">DiscreteMarkovChain</code> by adding <code class="highlighter-rouge">ContinuousMarkovChain</code>, as the latter is almost the same but with some extra parameters.</p>
</li>
<li>
<p>I will implement random matrices and their various Gaussian ensembles according to the conclusions of the design discussions.</p>
</li>
<li>
<p>Last but not least, I will try to get my open PRs merged.</p>
</li>
</ol>
<p>Thanks for reading and see you soon in phase 2. Bye!!</p>
<p><strong>Week 3 - Evaluation is coming :P</strong> (2019-06-16, https://czgdp1807.github.io/week_3)</p>
<p>Week 3 has ended and we are moving towards the phase 1 evaluations as we enter the fourth week. Let me tell you what was done during the previous week to make that last step smooth.</p>
<p>The basic structure of stochastic processes has been added with the merging of PR <a href="https://github.com/sympy/sympy/pull/16981">#16981</a>, and I am currently working on adding more features like <code class="highlighter-rouge">joint_distribution</code>, <code class="highlighter-rouge">expectation</code> and maybe some more, according to the suggestions received from the community. You can read more about it <a href="https://github.com/sympy/sympy/pull/17030#issuecomment-502455179">here</a>. So much for stochastic processes; now let’s move to the case of symbolic dimensions. I made a comment on the approach and the members agreed with it. I have made some updates to PR <a href="https://github.com/sympy/sympy/pull/16962">#16962</a>, and the summary is available in <a href="https://github.com/sympy/sympy/pull/16962#issuecomment-502461264">this comment</a>. I also discovered that the docs of <code class="highlighter-rouge">SymPy</code> weren’t being updated at <code class="highlighter-rouge">https://docs.sympy.org</code> for the <code class="highlighter-rouge">stats</code> module, and therefore I modified PR <a href="https://github.com/sympy/sympy/pull/16934">#16934</a>. In preparation for phase 2, I have created the issue <a href="https://github.com/sympy/sympy/issues/17039">#17039</a> for discussion on random matrices.</p>
<p>So, what did I learn? Premature optimization is bad for the health of the code; I learnt that while reviewing PRs. I also got to know why some imports cannot be put at the top of a file: it is all due to <code class="highlighter-rouge">circular imports</code>. It sounds intuitive and I had run into it a lot earlier, but I wasn’t aware of the exact term. Thanks to <a href="https://github.com/leosartaj">leosartaj</a> for this. I also tried the coverage tests on my local system with the help of <a href="https://github.com/oscarbenjamin">oscarbenjamin</a>. Read <a href="https://github.com/sympy/sympy/pull/16981#issuecomment-502095724">this comment</a>.</p>
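<p>The circular import problem can be reproduced in a few lines. The sketch below is only an illustration, not SymPy code: the module names (<code class="highlighter-rouge">demo_broken_a</code>, <code class="highlighter-rouge">demo_fixed_b</code>, and so on) are hypothetical and are written to a temporary directory so that the example is self-contained. The first pair of modules import each other at the top of the file and fail; the second pair works because one import is deferred into the function body.</p>

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical modules that import each other at the top of the file.
BROKEN = {
    "demo_broken_a.py": """
        from demo_broken_b import helper_b   # top-level import

        def helper_a():
            return "a"
    """,
    "demo_broken_b.py": """
        from demo_broken_a import helper_a   # demo_broken_a is only half loaded here

        def helper_b():
            return "b" + helper_a()
    """,
}

# Same modules, but one import is moved inside the function that needs it.
FIXED = {
    "demo_fixed_a.py": """
        from demo_fixed_b import helper_b

        def helper_a():
            return "a"
    """,
    "demo_fixed_b.py": """
        def helper_b():
            from demo_fixed_a import helper_a   # deferred until the call
            return "b" + helper_a()
    """,
}


def run_demo():
    """Return (did the broken pair fail to import?, result of the fixed pair)."""
    with tempfile.TemporaryDirectory() as tmp:
        for name, source in {**BROKEN, **FIXED}.items():
            Path(tmp, name).write_text(textwrap.dedent(source))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            try:
                importlib.import_module("demo_broken_a")
                broken_failed = False
            except ImportError:
                # helper_a does not exist yet when demo_broken_b asks for it.
                broken_failed = True
            fixed = importlib.import_module("demo_fixed_a")
            return broken_failed, fixed.helper_b()
        finally:
            sys.path.remove(tmp)


broken_failed, fixed_result = run_demo()
print(broken_failed, fixed_result)
```

<p>Deferring the import works because, by the time <code class="highlighter-rouge">helper_b</code> is actually called, both modules have finished loading.</p>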
<p>That’s it for now, bye!!</p>
<p><strong>Week 2 - A lot of work</strong> (2019-06-09, https://czgdp1807.github.io/week_2)</p>
<p>This week involved a lot of work and reviews. I worked on three PRs, covering Markov chains, symbolic dimensions and symbolic conditions for finite RVs. Let me tell you how I managed all this.</p>
<p>The most important and most demanding PR of these was the one for Markov chains. I told you that we were discussing its API. Fortunately the discussion concluded, and I started the implementation at the start of the second week by opening <a href="https://github.com/sympy/sympy/pull/16981">#16981</a>. In fact, the basic structure is ready and I am very happy that the code is able to handle <code class="highlighter-rouge">probability</code> queries and generates the desired results. Out of curiosity, I started work on symbolic dimensions to help one of my fellow students. In fact, it is also close to completion, see <a href="https://github.com/sympy/sympy/pull/16962">#16962</a>. The work on symbolic conditions, which started with the aim of fixing <code class="highlighter-rouge">@XFAIL</code> tests, is also almost complete; take a look at <a href="https://github.com/sympy/sympy/pull/16908">#16908</a>. I have observed that there are still some issues with <code class="highlighter-rouge">sympy.stats.frv</code> which need to be handled in a separate set of PRs. I am planning to open some of those this week if time permits. I also aim to add more features to <code class="highlighter-rouge">StochasticProcess</code>, like <code class="highlighter-rouge">joint_distribution</code>, and more properties to <code class="highlighter-rouge">DiscreteMarkovChain</code>, like whether it’s absorbing or transient. Adding <code class="highlighter-rouge">ContinuousMarkovChain</code> is also on my list.</p>
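<p>In the simplest case, a probability query on a discrete Markov chain comes down to an n-step transition probability: the chance of being in state <code class="highlighter-rouge">j</code> after <code class="highlighter-rouge">n</code> steps, given a start in state <code class="highlighter-rouge">i</code>, is entry <code class="highlighter-rouge">(i, j)</code> of the n-th power of the transition matrix. The sketch below shows this in plain Python with exact fractions. It is not the <code class="highlighter-rouge">DiscreteMarkovChain</code> implementation, and the function names and the transition matrix are made up for illustration.</p>

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def n_step_matrix(p, n):
    """Return the n-th power of the transition matrix p (n >= 0)."""
    size = len(p)
    # Start from the identity matrix, which is the 0-step transition matrix.
    result = [
        [Fraction(1 if i == j else 0) for j in range(size)]
        for i in range(size)
    ]
    for _ in range(n):
        result = mat_mul(result, p)
    return result


# Made-up 2-state transition matrix (each row sums to 1).
P = [
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(3, 4)],
]

# Probability of being in state 1 after 2 steps, starting from state 0.
two_step = n_step_matrix(P, 2)
print(two_step[0][1])  # 5/8
```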
<p>Apart from a lot of work there was a lot of learning too. I got many of my concepts cleared up, for example: inherit from <code class="highlighter-rouge">Basic</code> only when an instance of the class is going to be part of a <code class="highlighter-rouge">SymPy</code> expression. Thanks to <a href="https://github.com/certik">Ondřej Čertík</a> for this. I also realized the worth of <code class="highlighter-rouge">sympy.stats.symbolic_probability</code> while working on symbolic dimensions. Francesco used to suggest that we should use it wherever necessary, and now I completely understand his view.</p>
<p>That’s all, thanks for reading. Bye!!</p>
<p><strong>Week 1 - All about design, improvements and bug fixes</strong> (2019-06-02, https://czgdp1807.github.io/first_week)</p>
<p>The first week of the official coding phase is over and it’s time to tell you about what I did and what I learnt. So here it goes.</p>
<p>I started a discussion with my mentor about the API design of Markov chains. We have more clarity about it than before. The final API will be fairly close to the one mentioned in <a href="https://github.com/sympy/sympy/issues/16895#issuecomment-497649797">this</a> comment. The discussion about the probability space of a stochastic process is under way. We will conclude that soon.</p>
<p>I also made some PRs this week to improve the distributions which I added during the bonding period. PR <a href="https://github.com/sympy/sympy/pull/16914">#16914</a> allowed symbolic dimensions in <code class="highlighter-rouge">MultivariateEwens</code>. We had earlier decided to use <code class="highlighter-rouge">ArrayComprehension</code>, which one of my fellow students developed. However, it currently needs a lot of improvement before being put to use. I have made an attempt to optimise the code of <code class="highlighter-rouge">ArrayComprehension</code> via PR <a href="https://github.com/sympy/sympy/pull/16929">#16929</a>, which also covers some cases that were left out during the initial merge. I have also tried to allow symbolic conditions in finite probability spaces, and I have two approaches up for review: one is in the <a href="https://github.com/sympy/sympy/pull/16908/files">diff of PR #16908</a> and the other is in <a href="https://github.com/sympy/sympy/pull/16908#issuecomment-497950242">this</a> comment. I have also made some minor improvements: changing the namespace of <code class="highlighter-rouge">joint_rv_types</code> via PR <a href="https://github.com/sympy/sympy/pull/16919">#16919</a>, enhancing <code class="highlighter-rouge">P</code> in PR <a href="https://github.com/sympy/sympy/pull/16907">#16907</a>, and some general bug fixes in <code class="highlighter-rouge">MarginalDistribution</code> in PR <a href="https://github.com/sympy/sympy/pull/16934">#16934</a>.</p>
<p>Doing all this, I learnt a lot of things. The most important was that API design matters the most. Without it, it’s not possible to develop a good class structure. Thanks to Francesco for teaching me this. The other thing is that even a small change in a large code base can sometimes take hours to get right. Another thing I learnt is how small optimisations in a large code base can improve performance.</p>
<p>See you next week with some more progress. :)</p>
<p><strong>Community Bonding - A head start</strong> (2019-05-27, https://czgdp1807.github.io/bonding-period)</p>
<p>So, here I am with another set of experiences to share with you, this time from the Community Bonding period of GSoC. After discussing with my fellow developers and my mentors, we reframed our timeline. I will be working on very interesting topics in statistics, including joint distributions, Markov chains, random matrices, assumptions of dependence and simplifying the results of the stats module.</p>
<p>For a head start, I began working on adding multivariate distributions, and the good news is that I have completed the majority of the work, apart from some technical constraints which will be resolved by another project very soon. If you are interested in seeing the code, take a look at the PRs <a href="https://github.com/sympy/sympy/pull/16576">#16576</a>, <a href="https://github.com/sympy/sympy/pull/16808">#16808</a> and <a href="https://github.com/sympy/sympy/pull/16825">#16825</a>. In fact, apart from the stats module, I also had an opportunity to work on other modules of SymPy. For example, while working on discrete joint distributions, I improved the API of <code class="highlighter-rouge">Sum</code> in <code class="highlighter-rouge">sympy.concrete</code> by allowing a <code class="highlighter-rouge">Range</code> to be passed as limits. You can check PR <a href="https://github.com/sympy/sympy/pull/16810">#16810</a> for the details. Thanks to <a href="https://github.com/smichr">Christopher Smith</a> for helping me with this. There are still issues, like <a href="https://github.com/sympy/sympy/issues/16833">#16833</a>, which require a lot of discussion before a concrete implementation. I will update you on this if some progress takes place.</p>
<p>Apart from coding, we had discussions on the design of Markov chains and random matrices. I also made two prototypes, in PRs <a href="https://github.com/sympy/sympy/pull/16852">#16852</a> and <a href="https://github.com/sympy/sympy/pull/16866">#16866</a>. We still have more to think about on this part, such as the API and the class structure.</p>
<p>Overall, the bonding period was a great learning experience about how various modules affect each other and bring out the best in SymPy. See you soon in the next blog. :)</p>
<p><strong>Beginning GSoC 2019</strong> (2019-05-07, https://czgdp1807.github.io/first_blog)</p>
<p>I am very happy to share with you that on 6th May, 2019 I was accepted as a
GSoC student of <a href="https://www.sympy.org/">SymPy</a>. I will be working on the statistics module under the guidance
of my mentors, <a href="https://github.com/Upabjojr">Francesco Bonazzi</a> and <a href="https://github.com/sidhantnagpal">Sidhant Nagpal</a>.</p>
<p>I have known Francesco since I began contributing to the stats module. We had discussions
on various issues, like the complicated results returned by the statistics module and the inclusion
of stochastic processes in SymPy. I also worked with Sidhant while fixing an
issue with the wrong PDF of the Gumbel distribution. I believe that they are both great
people to learn from.</p>
<p>I will also work with Ritesh Kumar, who has also been accepted to work on the stats module. I
haven’t interacted with him yet. Let’s see how we coordinate our work to make this
project a success.</p>
<p>In fact, my first learning experience was using Jekyll to set up this blog on GitHub
Pages. Since this was my first time using Jekyll and GitHub Pages, I struggled a bit to find
the right resource. If you are also going to use Jekyll, then I would say visit <a href="https://help.github.com/en/articles/setting-up-your-github-pages-site-locally-with-jekyll">this page</a>.</p>