Jekyll2019-08-20T17:28:20+00:00https://czgdp1807.github.io/feed.xmlOpenBlogI blog about my experience with open source community.Gagandeep SinghWeek 12 - Ending GSoC 20192019-08-20T00:00:00+00:002019-08-20T00:00:00+00:00https://czgdp1807.github.io/week_12<p>As the title suggests, with the third phase, the journey of my GSoC 2019 comes to an end. It was full of challenges, learning experiences, and above all, interaction with the open source community of <code class="highlighter-rouge">SymPy</code>.<br />
In this blog post I will share with you the work done between phase 2 and phase 3, in terms of PRs, merged and open.</p>
<p><strong>Merged</strong></p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17174">#17174</a> : In this PR, Gaussian ensembles were added to <code class="highlighter-rouge">sympy.stats</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17304">#17304</a> : While working on the above PR, I got the idea to open this one to add circular ensembles to <code class="highlighter-rouge">sympy.stats</code>. I learned a lot about the Haar measure while working on this.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17306">#17306</a>: This PR added matrices with random expressions. The challenging part of this PR was to generate canonical results for passing the tests.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17336">#17336</a> : This fixed a bug in how <code class="highlighter-rouge">ask</code> handled <code class="highlighter-rouge">Q.hermitian</code> queries on a <code class="highlighter-rouge">Matrix</code>. Take a look at an example <a href="https://github.com/sympy/sympy/pull/17336#issue-304058013">here</a>.</p>
</li>
</ul>
<p><strong>Open</strong></p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17387">#17387</a> : This PR aims to add support for assumptions of dependence among random variables, such as <code class="highlighter-rouge">Covariance</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17146">#17146</a> : This PR, which fixes and upgrades the <code class="highlighter-rouge">Range</code> set, is in its last stages and we are finalizing a few things, like changes in the output of <code class="highlighter-rouge">Range</code>. As planned, I was successful at writing exhaustive and systematic tests.</p>
</li>
</ul>
<p>Well, now, time to say goodbye! It was a nice experience writing about my journey in this blog. If you have read this from the beginning then thanks a lot buddy, and I wish for your acceptance in GSoC 2020. Keep Open Sourcing :D</p>Gagandeep SinghFinal Report2019-08-20T00:00:00+00:002019-08-20T00:00:00+00:00https://czgdp1807.github.io/z_final_report<p>This report summarizes the work done in my GSoC 2019 project, <strong>Enhancement of Statistics Module</strong> with SymPy. A step by step development of the project is available at <a href="https://czgdp1807.github.io">czgdp1807.github.io</a>.</p>
<p><strong>About Me</strong></p>
<p>I am a third year Bachelor of Technology student at the Indian Institute of Technology, Jodhpur, in the Department of Computer Science and Engineering.</p>
<p><strong>Project Outline</strong></p>
<p>The project plan was focused on the following areas of statistics that were required to be added to <code class="highlighter-rouge">sympy.stats</code>.</p>
<ol>
<li><strong>Community Bonding</strong> - I was supposed to add the Dirichlet, Multivariate Ewens, Multinomial, Negative Multinomial, and Generalized Multivariate Log-Gamma distributions to <code class="highlighter-rouge">sympy.stats.joint_rv_types</code>.</li>
<li><strong>Phase 1</strong> - I was supposed to work on stochastic processes, primarily on Markov chains, including their API design, algorithms and implementation.</li>
<li><strong>Phase 2</strong> - I was expected to work on random matrices, including Gaussian ensembles and matrices with random expressions as their elements.</li>
<li><strong>Phase 3</strong> - I planned to work on assumptions of dependence, improving result generation by <code class="highlighter-rouge">sympy.stats</code> and improving other modules so that <code class="highlighter-rouge">sympy.stats</code> can function properly.</li>
</ol>
<p><strong>Pull Requests</strong></p>
<p>This section describes the actual work done during the coding period in terms of merged PRs.</p>
<ol>
<li><strong>Community Bonding</strong></li>
</ol>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16576">#16576</a>: This PR added the <code class="highlighter-rouge">Dirichlet</code> and <code class="highlighter-rouge">MultivariateEwens</code> distributions.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16808">#16808</a> : This PR added the <code class="highlighter-rouge">Multinomial</code> and <code class="highlighter-rouge">NegativeMultinomial</code> distributions.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16810">#16810</a> : This PR improved the API of <code class="highlighter-rouge">Sum</code> by allowing <code class="highlighter-rouge">Range</code> as the limits.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16825">#16825</a> : Continuing the above work, this PR added the <code class="highlighter-rouge">GeneralizedMultivariateLogGamma</code> distribution. This was an interesting one due to the complexity involved in its PDF.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16834">#16834</a> : This PR enhanced the <code class="highlighter-rouge">Multinomial</code> and <code class="highlighter-rouge">NegativeMultinomial</code> distributions by allowing symbolic dimensions for them.</p>
</li>
</ul>
<ol start="2">
<li><strong>Phase 1</strong></li>
</ol>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16897">#16897</a> : This was related to <code class="highlighter-rouge">sympy.core</code> and it helped remove a disparity in the results of the special function <code class="highlighter-rouge">gamma</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16908">#16908</a> : This PR improved <code class="highlighter-rouge">sympy.stats.frv</code> by allowing conditions with foreign symbols.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16913">#16913</a> : This removed the unreachable code from <code class="highlighter-rouge">sympy.stats.frv</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16914">#16914</a> : This PR allowed symbolic dimensions to <code class="highlighter-rouge">MultivariateEwens</code> distribution.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16929">#16929</a> : This one was for the <code class="highlighter-rouge">sympy.tensor</code> module. It optimized the <code class="highlighter-rouge">ArrayComprehension</code> and covered some corner cases.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16981">#16981</a> : This PR added the architecture of stochastic processes. It also added discrete Markov chain to <code class="highlighter-rouge">sympy.stats</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17030">#17030</a> : Some features, like <code class="highlighter-rouge">joint_distribution</code>, were added to stochastic processes in this PR.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17046">#17046</a> : Some common properties of discrete Markov chains, such as the fundamental matrix and the fixed row vector, were added.</p>
</li>
</ul>
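<p>To illustrate what the fixed row vector of a discrete Markov chain is, here is a minimal sketch in plain Python. It is not the code from the PR above: the function name <code class="highlighter-rouge">fixed_row_vector</code>, the use of <code class="highlighter-rouge">Fraction</code> and the Gauss-Jordan approach are chosen only for this example, and it assumes that the chain has exactly one fixed row vector.</p>

```python
from fractions import Fraction


def fixed_row_vector(P):
    """Solve w * P = w with sum(w) = 1 for a row-stochastic matrix P.

    Illustrative only: assumes the solution is unique (for example, an
    irreducible chain); otherwise the pivot search below fails.
    """
    n = len(P)
    # Row j of the augmented system encodes sum_i w[i] * (P[i][j] - delta_ij) = 0.
    A = [
        [Fraction(P[i][j]) - (1 if i == j else 0) for i in range(n)] + [Fraction(0)]
        for j in range(n)
    ]
    # One of those equations is redundant, so replace the last one by
    # the normalisation condition sum(w) = 1.
    A[-1] = [Fraction(1)] * n + [Fraction(1)]

    # Gauss-Jordan elimination with exact rational arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# A two-state chain: from state 0 we move with probability 1/2,
# from state 1 we move with probability 1/4.
P = [
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(3, 4)],
]
w = fixed_row_vector(P)
print(w)  # [Fraction(1, 3), Fraction(2, 3)]
```

<p>Multiplying <code class="highlighter-rouge">w</code> by <code class="highlighter-rouge">P</code> gives back <code class="highlighter-rouge">w</code>, which is exactly the property that makes it a fixed row vector.</p>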
<ol start="3">
<li><strong>Phase 2</strong></li>
</ol>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16934">#16934</a> : The bug fixes for <code class="highlighter-rouge">sympy.stats.joint_rv_types</code> were complete and the further work has been handed over to my co-student, Ritesh.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16962">#16962</a> : This was a continuation of the work done in phase 1 for allowing symbolic dimensions in finite random variables. As I planned, this PR got merged in phase 2, after some changes.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17083">#17083</a>: The work done in this PR laid the groundwork and the motivation for the next one. The algorithm that got merged was a bit difficult to extend and maintain. Thanks to Francesco for his <a href="https://github.com/sympy/sympy/pull/17083#issuecomment-508256359">comment</a>, which motivated me to re-think the whole framework.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17163">#17163</a> : This was one of the most challenging PRs of the project, because it involved re-designing the algorithm, refactoring the code and, above all, a lot of thinking. The details can be found at <a href="https://github.com/sympy/sympy/pull/17163#issuecomment-510939984">this comment</a>.</p>
</li>
</ul>
<ol start="4">
<li><strong>Phase 3</strong></li>
</ol>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17174">#17174</a> : In this PR, Gaussian ensembles were added to <code class="highlighter-rouge">sympy.stats</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17304">#17304</a> : While working on the above PR, I got the idea to open this one to add circular ensembles to <code class="highlighter-rouge">sympy.stats</code>. I learned a lot about the Haar measure while working on this.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17306">#17306</a>: This PR added matrices with random expressions. The challenging part of this PR was to generate canonical results for passing the tests.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17336">#17336</a> : This fixed a bug in how <code class="highlighter-rouge">ask</code> handled <code class="highlighter-rouge">Q.hermitian</code> queries on a <code class="highlighter-rouge">Matrix</code>. Take a look at an example <a href="https://github.com/sympy/sympy/pull/17336#issue-304058013">here</a>.</p>
</li>
</ul>
<p><strong>Miscellaneous Work</strong></p>
<p>This section contains some of my PRs related to miscellaneous issues like workflow improvement.</p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16899">#16899</a> : This was a workflow-related PR to ignore the <code class="highlighter-rouge">.vscode</code> folder.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17003">#17003</a> : This PR ignored the <code class="highlighter-rouge">__pycache__</code> folder by adding it to the <code class="highlighter-rouge">.gitignore</code> file.</p>
</li>
</ul>
<p><strong>Future Work</strong></p>
<p>The following PRs are open and are in their last stages for merging. Any interested student can take a look at them to extend my work in his/her GSoC project.</p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17387">#17387</a> : This PR aims to add support for assumptions of dependence among random variables, such as <code class="highlighter-rouge">Covariance</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17146">#17146</a> : This PR, which fixes and upgrades the <code class="highlighter-rouge">Range</code> set, is in its last stages and we are finalizing a few things, like changes in the output of <code class="highlighter-rouge">Range</code>. As planned, I was successful at writing exhaustive and systematic tests.</p>
</li>
</ul>
<p>Apart from the above, work on densities of Circular ensembles remains to be done. One can read Theorem 3 on page 8 of <a href="https://arxiv.org/pdf/1103.3408.pdf">this paper</a>.</p>Gagandeep SinghWeek 11 - Final touches2019-08-13T00:00:00+00:002019-08-13T00:00:00+00:00https://czgdp1807.github.io/week_11<p>So, the second last week of the project is over and we have decided to spend the last few days improving the work we have done so far. Read below to know more.</p>
<p>This week, I worked on <a href="https://github.com/sympy/sympy/pull/17146">#17146</a>, concerned with symbolic <code class="highlighter-rouge">Range</code>; <a href="https://github.com/sympy/sympy/pull/17387">#17387</a>, related to assumptions of dependence among random variables; <a href="https://github.com/sympy/sympy/pull/17336">#17336</a>, which fixed the bug in <code class="highlighter-rouge">Q.hermitian</code> that I told you about in my previous post; and <a href="https://github.com/sympy/sympy/pull/17306">#17306</a>, implementing matrices with random expressions.</p>
<p>In fact, the last two PRs are merged. Now, coming to symbolic <code class="highlighter-rouge">Range</code>, I have completed the testing of all the methods except the <code class="highlighter-rouge">slicing</code> feature of <code class="highlighter-rouge">__getitem__</code>, which I will do this week. Regarding the bug in <code class="highlighter-rouge">Q.hermitian</code>, my code was at first giving incorrect results due to overriding problems in the logic. Francesco helped me correct them and it’s finally in. The major part of the week was devoted to assumptions of dependence. I did some study from Wikipedia, and implemented the class <code class="highlighter-rouge">DependentPSpace</code>. I have kept the class static because it will handle queries of the type <code class="highlighter-rouge">density(X + Y, Eq(Covariance(X, Y), S(1)/2))</code>, which from my point of view don’t require the creation of a probability space object.</p>
<p>Coming to the plan for the last week, we have decided that no new PRs will be opened and the focus will be on completing the already open PRs, so that most of our work is finished. Francesco has also suggested testing the newly introduced classes against their Wolfram Alpha counterparts, so that there are no inconsistencies.</p>Gagandeep SinghWeek 10 - Debugging, testing and Haar measure2019-08-06T00:00:00+00:002019-08-06T00:00:00+00:00https://czgdp1807.github.io/week_10<p>This week was about a lot of debugging and testing. I also got to know some facts about random matrices and group theory.</p>
<p>With the end of the 10th week, we have entered the second last week of the project. Well, this week was full of finding bugs, fixing them and testing the fixes. Mainly, I worked on <a href="https://github.com/sympy/sympy/pull/17146">#17146</a>, <a href="https://github.com/sympy/sympy/pull/17304">#17304</a>, <a href="https://github.com/sympy/sympy/pull/17336">#17336</a> and <a href="https://github.com/sympy/sympy/pull/17306">#17306</a>. The first one was related to symbolic <code class="highlighter-rouge">Range</code>, and it lacked systematic and robust tests. I pushed some commits to resolve the issue, though more is to be done. Now, coming to the second PR, it was related to circular ensembles. I got to know that the distribution of these ensembles is something called the Haar measure on <code class="highlighter-rouge">U(n)</code>, the group of unitary matrices. I was not familiar with this. Thanks to <a href="https://github.com/jksuom">jksuom</a> for sharing some papers on it. I will go through them in the following week. The third PR fixes a bug which was found while working on circular ensembles. Actually, <code class="highlighter-rouge">ask(Q.hermitian(Matrix([[2, 2 + I, 4], [2 - I, 3, I], [4, -I, 1]])))</code> was giving <code class="highlighter-rouge">False</code>, even though the matrix is clearly Hermitian. So, I went ahead and fixed it, and I am waiting for reviews on my approach. The last one is related to matrices with random elements and it is complete after fixing a few bugs related to canonical outputs.</p>
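<p>To see why that matrix is Hermitian, here is a small check in plain Python. It does not use SymPy at all: the built-in complex unit <code class="highlighter-rouge">1j</code> stands in for <code class="highlighter-rouge">I</code>, and the helper name <code class="highlighter-rouge">is_hermitian</code> is just an illustrative choice. A matrix is Hermitian when every entry equals the complex conjugate of its mirror entry across the diagonal.</p>

```python
# The matrix from the bug report, with Python's built-in complex numbers
# standing in for SymPy's imaginary unit I.
M = [
    [2, 2 + 1j, 4],
    [2 - 1j, 3, 1j],
    [4, -1j, 1],
]


def is_hermitian(matrix):
    """Return True if a square matrix equals its conjugate transpose."""
    n = len(matrix)
    return all(
        matrix[i][j] == complex(matrix[j][i]).conjugate()
        for i in range(n)
        for j in range(n)
    )


print(is_hermitian(M))  # True
```

<p>The diagonal entries are real and each off-diagonal pair, such as <code class="highlighter-rouge">2 + I</code> and <code class="highlighter-rouge">2 - I</code>, is a conjugate pair, so the expected answer to the query is <code class="highlighter-rouge">True</code>.</p>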
<p>What I learnt this week?
Well, I learnt, <strong>When you think your work is complete, well, sorry to say, that’s the beginning ;-)</strong></p>
<p>Bye!!</p>Gagandeep SinghWeek 9 - Lots of reviews2019-07-29T00:00:00+00:002019-07-29T00:00:00+00:00https://czgdp1807.github.io/week_9<p>This week I received a lot of reviews from the members of the community on my various PRs and this has formed the base of the work for the next week. Let me share some of those reviews with you.</p>
<p>As I told you, the PR <a href="https://github.com/sympy/sympy/pull/17146">#17146</a> was pending review. Well, I received a lot of comments from <a href="https://github.com/oscarbenjamin">@oscarbenjamin</a> and <a href="https://github.com/smichr">@smichr</a> on pretty printing of symbolic <code class="highlighter-rouge">Range</code>, the way tests are written, and the <code class="highlighter-rouge">inf</code> and <code class="highlighter-rouge">sup</code> of <code class="highlighter-rouge">Range</code>. This in turn helped me to discover bugs in other features of <code class="highlighter-rouge">Range</code>, like <code class="highlighter-rouge">reversed</code>. In the following week, I will work on this stuff and correct things. Now moving on to random matrices, the PR <a href="https://github.com/sympy/sympy/pull/17174">#17174</a> has been merged but more work is to be done for <code class="highlighter-rouge">Matrix</code> with entries as random variables. In fact, I studied expressions of random matrices and summarised the results <a href="https://github.com/sympy/sympy/pull/17174#issuecomment-514985333">here</a>. The findings suggest specific algorithms for specific expressions like sums, though I am still looking for a more generalized technique and will update you if I find one.</p>
<p>So, coming to the learning aspect. This week I learnt about the importance of exhaustive and systematic tests. The tests which I wrote for symbolic <code class="highlighter-rouge">Range</code> aren’t so systematic and robust. I have found a way to improve them from <a href="https://github.com/sympy/sympy/pull/17146#discussion_r307971324">this comment</a>.</p>
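<p>As a rough idea of what exhaustive and systematic tests can look like, here is a sketch that uses Python’s built-in <code class="highlighter-rouge">range</code> instead of SymPy’s <code class="highlighter-rouge">Range</code>. The helper names <code class="highlighter-rouge">inf_sup</code> and <code class="highlighter-rouge">check_all</code> are made up for this example and are not taken from the PR. The idea is to compare a closed-form answer with brute force over every combination of small start, stop and step values, rather than over a handful of hand-picked cases.</p>

```python
from itertools import product


def inf_sup(r):
    """Return (inf, sup) of a non-empty built-in range without iterating it."""
    first, last = r[0], r[-1]
    # With a negative step the range counts down, so the ends swap roles.
    return (first, last) if r.step > 0 else (last, first)


def check_all(limit=4):
    """Compare closed-form answers with brute force on every small range."""
    checked = 0
    values = range(-limit, limit + 1)
    for start, stop, step in product(values, values, values):
        if step == 0:
            continue  # range() rejects a zero step
        r = range(start, stop, step)
        if len(r) == 0:
            continue  # inf and sup are not meaningful for an empty range
        assert inf_sup(r) == (min(r), max(r))
        assert list(reversed(r)) == list(r)[::-1]
        checked += 1
    return checked


print(inf_sup(range(10, 0, -3)))  # (1, 10)
print(check_all() > 0)            # True
```

<p>A grid like this covers positive and negative steps and ranges whose stop is not hit exactly, which are the kinds of corner cases that hand-written tests tend to miss.</p>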
<p>That’s all for now, signing off!!</p>Gagandeep SinghWeek 8 - Heading towards completion2019-07-22T00:00:00+00:002019-07-22T00:00:00+00:00https://czgdp1807.github.io/week_8<p>With the 8th week, the second phase of my project is complete and we are heading towards the end of GSoC 2019.
This blog post summarises the work done between phase 1 and phase 2, in terms of PRs. Moreover, I will share with you my plans for the last phase.</p>
<p>I worked on the following PRs (listed in chronological order) during the second phase; some of them got merged and a few are still open.</p>
<p><strong>Merged</strong></p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16934">#16934</a> : The bug fixes were complete and the further work has been handed over to my co-student, Ritesh.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16962">#16962</a> : This was a continuation of the work done in phase 1 for allowing symbolic dimensions in finite random variables. As I planned, this PR got merged in phase 2, after some changes.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17083">#17083</a>: The work done in this PR laid the groundwork and the motivation for the next one. The algorithm that got merged was a bit difficult to extend and maintain. Thanks to Francesco for his <a href="https://github.com/sympy/sympy/pull/17083#issuecomment-508256359">comment</a>, which motivated me to re-think the whole framework.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17163">#17163</a> : This was one of the most challenging PRs of the project till now, because it involved re-designing the algorithm, refactoring the code and, above all, a lot of thinking. The details can be found at <a href="https://github.com/sympy/sympy/pull/17163#issuecomment-510939984">this comment</a>.</p>
</li>
</ul>
<p><strong>Open</strong></p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17174">#17174</a> : This PR aims at adding random matrices to sympy. Currently, I am studying expressions involving random matrices and how to compute their densities.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17146">#17146</a> : This PR aims at allowing symbolic parameters to <code class="highlighter-rouge">Range</code>. The work is complete and I am waiting for final reviews. Hopefully it gets merged.</p>
</li>
</ul>
<p>Apart from the above PRs I also reviewed code written by other contributors.
Overall, for me phase 2 was a great learning and problem-solving experience.</p>
<p>Let me share with you the outline of my plan for phase 3:</p>
<ol>
<li>
<p>I will work on enhancing the result generation by the stats module as was planned in the beginning.</p>
</li>
<li>
<p>The work on random matrices will be extended and I will add more features to them.</p>
</li>
<li>
<p>Probably, if time permits, I will try to do some more refactoring of <code class="highlighter-rouge">sympy.stats.frv</code>, and will try to add some more stochastic processes.</p>
</li>
<li>
<p>I will also try to merge my open PRs.</p>
</li>
</ol>
<p>Thanks for reading and see you soon in phase 3. Bye!!</p>Gagandeep SinghWeek 7 - All about logics and algorithms2019-07-15T00:00:00+00:002019-07-15T00:00:00+00:00https://czgdp1807.github.io/week_7<p>This week required a lot of thinking before jumping to code the stuff. Interested? Okay, move on to the next paragraph.</p>
<p>Basically, I worked on three PRs: <a href="https://github.com/sympy/sympy/pull/17163">#17163</a> for continuous time Markov chains, <a href="https://github.com/sympy/sympy/pull/17174">#17174</a> for random matrices and <a href="https://github.com/sympy/sympy/pull/17146">#17146</a> for symbolic Ranges. The first and the last PRs are very intensive. I developed a new algorithm for the query handler of the <code class="highlighter-rouge">ContinuousMarkovChain.probability</code> method, because the previous one, which I implemented in <code class="highlighter-rouge">DiscreteMarkovChain.probability</code>, was quite ad hoc, rigid, and difficult to maintain and extend. The philosophy behind the algorithm is recursion, i.e., boil everything down to <code class="highlighter-rouge">Relational</code> queries, convert them to sets and then calculate the probability. You can find the complete description <a href="https://github.com/sympy/sympy/pull/17163#issuecomment-510939984">here</a>. I am waiting for any critical objections from my mentors and after that I will refactor the code as suggested by <a href="https://github.com/oscarbenjamin">oscarbenjamin</a> and <a href="https://github.com/jksuom">jksuom</a>. So, now let’s move on to random matrices. As it was to be implemented from scratch, it required a bit of thinking to reach a decent architecture. Currently, the PR is at a basic level, and some more testing is to be done. Now, coming to symbolic <code class="highlighter-rouge">Range</code>. Let me tell you, it requires a lot of logical thinking to make <code class="highlighter-rouge">Range</code> accept symbolic parameters. A lot of tests fail, and a lot of debugging has to be done to make a method work.
In fact, we might deprecate <code class="highlighter-rouge">xrange</code> support in <code class="highlighter-rouge">Range</code> because we are going to drop <code class="highlighter-rouge">Python 2</code> support from <code class="highlighter-rouge">SymPy</code>.</p>
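<p>The recursive idea described above can be sketched in a few lines of plain Python. This is only a toy illustration and not the algorithm from the PR: the tuple-based query format, the names <code class="highlighter-rouge">to_set</code> and <code class="highlighter-rouge">probability</code>, and the fixed distribution over three states are all assumptions made for the example, with ordinary Python sets standing in for SymPy’s sets and <code class="highlighter-rouge">Relational</code> objects.</p>

```python
import operator
from fractions import Fraction

# Hypothetical toy setup: three states and a query language made of tuples.
# A relational leaf looks like ("<", 2); compound queries look like
# ("and", q1, q2), ("or", q1, q2) or ("not", q).
STATES = frozenset({0, 1, 2})
COMPARE = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def to_set(query):
    """Recursively reduce a query to the set of states that satisfy it."""
    op = query[0]
    if op == "and":
        return to_set(query[1]) & to_set(query[2])
    if op == "or":
        return to_set(query[1]) | to_set(query[2])
    if op == "not":
        return STATES - to_set(query[1])
    # Base case: a single relational condition on the state.
    return frozenset(s for s in STATES if COMPARE[op](s, query[1]))


def probability(query, distribution):
    """Add up the probabilities of all states in the query's set."""
    return sum(distribution[s] for s in to_set(query))


distribution = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
query = ("or", ("<", 1), ("and", (">", 0), ("not", ("==", 1))))
print(probability(query, distribution))  # 2/3
```

<p>Every compound query is handled by the same three set operations, so supporting a new kind of condition only means adding a new base case, which is what makes this style easier to extend than a handler written case by case.</p>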
<p>This week I learnt to combine the concepts from algorithms and software engineering to develop the stuff I mentioned above. This was the best week of my overall GSoC experience till now.</p>
<p>A lot more lies ahead. Bye!!</p>Gagandeep SinghWeek 6 - Some extensions2019-07-08T00:00:00+00:002019-07-08T00:00:00+00:00https://czgdp1807.github.io/week_6<p>This week was a mix of discussion on design and extending previous work. I also got to know about some new cool features of <code class="highlighter-rouge">SymPy</code>.</p>
<p>According to the plan proposed in <a href="https://czgdp1807.github.io/week_4/">Week 4</a>, I have completed my work on <code class="highlighter-rouge">DiscreteMarkovChain</code> via <a href="https://github.com/sympy/sympy/pull/17083">PR #17083</a>. I used the <code class="highlighter-rouge">as_set</code> and <code class="highlighter-rouge">as_relational</code> methods, which helped me cover many miscellaneous cases, so <code class="highlighter-rouge">DiscreteMarkovChain</code> is now quite flexible and can handle a variety of generic <code class="highlighter-rouge">probability</code> and <code class="highlighter-rouge">expectation</code> queries. I have also started <a href="https://github.com/sympy/sympy/pull/17163">PR #17163</a> to add <code class="highlighter-rouge">ContinuousMarkovChain</code>, and I am finding it a bit tricky to maintain both performance and result quality while working on it. Moving on to symbolic <code class="highlighter-rouge">Range</code>: the work has started in <a href="https://github.com/sympy/sympy/pull/17146">PR #17146</a>, and I have found one disparity between <code class="highlighter-rouge">Range</code> and Python’s <code class="highlighter-rouge">range</code> (details are available in <a href="https://github.com/sympy/sympy/pull/17146#discussion_r300162219">this thread</a>). I will try to fix it with minimal changes to the code. The TensorFlow-related <a href="https://github.com/sympy/sympy/pull/17103">PR #17103</a>, which I started in the previous week, is also almost complete and is waiting for the <code class="highlighter-rouge">TensorFlow 2.0</code> release. I am also studying the architecture of that framework in order to make changes to <code class="highlighter-rouge">lambdify</code>. Regarding random matrices, I believe the discussion has reached its final stages, and I am waiting for comments from Francesco on possible improvements in issue <a href="https://github.com/sympy/sympy/issues/17039">#17039</a>.</p>
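<p>To give a feel for the two methods mentioned above, here is a minimal sketch of converting between a relational condition and a set in SymPy. It is only an illustration of <code class="highlighter-rouge">as_set</code> and <code class="highlighter-rouge">as_relational</code> on an ordinary symbol; the condition and the variable names are my own assumptions and this is not the query-handling code from the PR.</p>

```python
from sympy import Interval, Symbol

# A plain real symbol, used here only for illustration.
x = Symbol('x', real=True)

# A compound condition written with relational operators.
condition = (x > 2) & (x <= 5)

# as_set turns the condition into the set of values satisfying it.
solution_set = condition.as_set()

# as_relational goes the other way: from a set back to a condition on x.
relational = solution_set.as_relational(x)

print(solution_set)
print(relational)
```

<p>Once a condition is in set form, membership checks such as <code class="highlighter-rouge">3 in solution_set</code> become straightforward, which is why the conversion is handy when many differently written queries have to be treated uniformly.</p>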
<p>Let me share my discoveries and learnings from this week. Thanks to Francesco for telling me about <code class="highlighter-rouge">sympy.multipledispatch</code>. It helps in implementing function overloading based on argument types, similar to overloading in C++. I liked it very much. I also read about continuous Markov chains and learnt about the generator matrix and the forward and backward equations. One interesting fact: the Poisson process and the continuous Markov chain are closely related via generator matrices, which will make the implementation of the former much easier :D.</p>
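<p>The idea behind multiple dispatch can be sketched in a few lines of plain Python. Note that this is a toy registry written only for illustration: the <code class="highlighter-rouge">dispatch</code> decorator and the <code class="highlighter-rouge">combine</code> function below are hypothetical and are not the implementation used by <code class="highlighter-rouge">sympy.multipledispatch</code>.</p>

```python
# Toy multiple-dispatch registry (illustrative only, not SymPy's implementation).
_registry = {}


def dispatch(*types):
    """Register an implementation of a function for the given argument types."""
    def decorator(func):
        name = func.__name__
        _registry.setdefault(name, []).append((types, func))

        def wrapper(*args):
            # Pick the first registered implementation whose signature matches.
            for signature, impl in _registry[name]:
                if len(signature) == len(args) and all(
                    isinstance(arg, typ) for arg, typ in zip(args, signature)
                ):
                    return impl(*args)
            raise NotImplementedError(
                "no implementation of %s for %s"
                % (name, tuple(type(arg).__name__ for arg in args))
            )

        wrapper.__name__ = name
        return wrapper
    return decorator


@dispatch(int, int)
def combine(a, b):
    # Two integers: add them.
    return a + b


@dispatch(str, int)
def combine(a, b):
    # A string and an integer: repeat the string.
    return a * b


print(combine(2, 3))
print(combine("ab", 2))
```

<p>The same name, <code class="highlighter-rouge">combine</code>, ends up with one implementation per combination of argument types, and the right one is chosen at call time.</p>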
<p>Leaving you for now. Bye!!</p>
Gagandeep Singh
Week 5 - Transition towards Phase 2
2019-07-02T00:00:00+00:00
https://czgdp1807.github.io/week_5
<p>The evaluation results for phase 1 are out, and I am very glad to share with you that I have passed with flying colors. I received “Well done so far.” as the feedback for my work to date.</p>
<p>So now let us move to the work done between phase 1 and phase 2. Firstly, both of my open PRs from the previous phase, <a href="https://github.com/sympy/sympy/pull/16962">#16962</a> and <a href="https://github.com/sympy/sympy/pull/16934">#16934</a>, have been merged, though for symbolic dimensions some more work has to be done to make <code class="highlighter-rouge">sympy.stats.frv</code> more efficient and maintainable. I have also started work on PR <a href="https://github.com/sympy/sympy/pull/17083">#17083</a> to extend the scope of queries for <code class="highlighter-rouge">DiscreteMarkovChain</code>, and the system has become a bit smarter. During this week, while working on PR <a href="https://github.com/sympy/sympy/pull/17103">#17103</a>, I came across the news that TensorFlow has changed a lot of APIs while migrating from 1.x to 2.x. AFAIK, they are moving from the previous <code class="highlighter-rouge">Session</code> approach towards a <code class="highlighter-rouge">Function</code> approach, and because of that SymPy’s <code class="highlighter-rouge">lambdify</code> ran into some issues, which I will be fixing soon with the help of other members. The TensorFlow details can be seen <a href="https://github.com/tensorflow/community/blob/b1d83bf2ee3fc72650140b89656e29932db36226/rfcs/20180918-functions-not-sessions-20.md">here</a>.</p>
<p>Now, let’s move to the learning part. During the transition period I learnt about the dependencies of <code class="highlighter-rouge">SymPy</code>. Moreover, I saw how some bugs can go unnoticed when code is left untested. Thanks again to <a href="https://github.com/oscarbenjamin">oscarbenjamin</a> for letting me know about the bugs related to the variance of finite random variables. I also learnt that a bare <code class="highlighter-rouge">except</code> can catch even a keyboard interrupt, which makes it quite risky. Thanks to <a href="https://github.com/sidhantnagpal">sidhantnagpal</a> for helping me with this.</p>
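<p>The point about bare <code class="highlighter-rouge">except</code> can be seen in a small sketch. The function names here are made up for the example, and the keyboard interrupt is simulated by raising <code class="highlighter-rouge">KeyboardInterrupt</code> directly instead of pressing Ctrl+C.</p>

```python
def run_with_bare_except(task):
    """Run task and hide any failure, including a user interrupt."""
    try:
        task()
    except:  # bare except: also catches KeyboardInterrupt and SystemExit
        return "swallowed"
    return "ok"


def run_with_except_exception(task):
    """Run task and handle only ordinary errors."""
    try:
        task()
    except Exception:  # KeyboardInterrupt is not a subclass of Exception
        return "handled"
    return "ok"


def interrupted():
    # Simulates the user pressing Ctrl+C while the task is running.
    raise KeyboardInterrupt


# The bare except silently eats the interrupt.
print(run_with_bare_except(interrupted))

# except Exception lets the interrupt propagate to the caller.
try:
    run_with_except_exception(interrupted)
except KeyboardInterrupt:
    print("interrupt propagated")
```

<p>Since <code class="highlighter-rouge">KeyboardInterrupt</code> derives from <code class="highlighter-rouge">BaseException</code> and not from <code class="highlighter-rouge">Exception</code>, catching <code class="highlighter-rouge">Exception</code> is usually the safer default.</p>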
<p>So, that’s all for this week, see you next week. Bye!!</p>
Gagandeep Singh
Week 4 - Phase 1 Ends
2019-06-23T00:00:00+00:00
https://czgdp1807.github.io/week_4
<p>So, with the fourth week, the first phase of my journey with <code class="highlighter-rouge">SymPy</code> has come to an end. This blog post summarises the work done till now, in terms of PRs. Moreover, I will share with you the plans for phase 2.</p>
<p>I worked on the following PRs (listed in chronological order) during the first phase; many of them got merged and a few are still open.</p>
<p><strong>Merged</strong></p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16528">#16528</a> : I worked on extending the <code class="highlighter-rouge">GumbelDistribution</code> to support both minimum and maximum versions of it.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16576">#16576</a> : This PR added <code class="highlighter-rouge">Dirichlet</code> and <code class="highlighter-rouge">MultivariateEwens</code> distributions.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16808">#16808</a> : This PR added <code class="highlighter-rouge">Multinomial</code> and <code class="highlighter-rouge">NegativeMultinomial</code> distributions.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16810">#16810</a> : This PR improved the API of <code class="highlighter-rouge">Sum</code> by allowing <code class="highlighter-rouge">Range</code> as the limits.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16825">#16825</a> : Continuing the work on distributions, this PR added the <code class="highlighter-rouge">GeneralizedMultivariateLogGamma</code> distribution. This was an interesting one due to the complexity involved in its PDF.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16834">#16834</a> : This PR enhanced the <code class="highlighter-rouge">Multinomial</code> and <code class="highlighter-rouge">NegativeMultinomial</code> distributions by allowing symbolic dimensions for them.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16897">#16897</a> : This was related to <code class="highlighter-rouge">sympy.core</code> and it helped in removing disparity in the results of special function <code class="highlighter-rouge">gamma</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16899">#16899</a> : This was a workflow-related PR to ignore the <code class="highlighter-rouge">.vscode</code> folder.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16908">#16908</a> : This PR improved <code class="highlighter-rouge">sympy.stats.frv</code> by allowing conditions with foreign symbols.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16913">#16913</a> : This removed the unreachable code from <code class="highlighter-rouge">sympy.stats.frv</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16914">#16914</a> : This PR allowed symbolic dimensions for the <code class="highlighter-rouge">MultivariateEwens</code> distribution.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16929">#16929</a> : This one was for the <code class="highlighter-rouge">sympy.tensor</code> module. It optimized the <code class="highlighter-rouge">ArrayComprehension</code> and covered some corner cases.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16981">#16981</a> : This PR added the architecture of stochastic processes. It also added discrete Markov chain to <code class="highlighter-rouge">sympy.stats</code>.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17003">#17003</a> : This PR ignored the <code class="highlighter-rouge">__pycache__</code> folder by adding it to the <code class="highlighter-rouge">.gitignore</code> file.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17030">#17030</a> : Some features, like <code class="highlighter-rouge">joint_distribution</code>, were added in this PR.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/17046">#17046</a> : Some common properties of discrete Markov chains, like the fundamental matrix and the fixed row vector, were added.</p>
</li>
</ul>
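<p>As a small aside on the fixed row vector mentioned in the last item, here is a hedged sketch in plain Python. It assumes the usual textbook meaning, a probability row vector <code class="highlighter-rouge">w</code> with <code class="highlighter-rouge">w P = w</code>, and handles only a two-state chain with a made-up transition matrix; the helper names are hypothetical and this is not the code from the PR.</p>

```python
from fractions import Fraction


def fixed_row_vector_2state(P):
    """Return w with w P = w and sum(w) == 1 for a 2-state transition matrix.

    Assumes the two off-diagonal entries are not both zero.
    """
    a = P[0][1]  # probability of moving from state 0 to state 1
    b = P[1][0]  # probability of moving from state 1 to state 0
    total = a + b
    return [b / total, a / total]


def row_times_matrix(w, P):
    """Multiply the row vector w by the matrix P."""
    return [
        sum(w[i] * P[i][j] for i in range(len(w)))
        for j in range(len(P[0]))
    ]


# Made-up transition matrix; each row sums to 1.
P = [
    [Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 4), Fraction(3, 4)],
]

w = fixed_row_vector_2state(P)
print(w)
print(row_times_matrix(w, P) == w)
```

<p>Exact fractions are used so that the check <code class="highlighter-rouge">w P = w</code> holds without floating point error.</p>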
<p><strong>Open</strong></p>
<ul>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16962">#16962</a> : The aim of this PR is to allow symbolic dimensions for single finite distributions, such as <code class="highlighter-rouge">Die</code> and <code class="highlighter-rouge">Binomial</code>. The work from my side is complete on this.</p>
</li>
<li>
<p><a href="https://github.com/sympy/sympy/pull/16934">#16934</a> : This PR aims to fix the bugs and test the code introduced in <code class="highlighter-rouge">GSoC</code> 2018.</p>
</li>
</ul>
<p>Apart from the above PRs, I also reviewed code written by other contributors.
Overall, I found phase 1 to be a great learning and working experience.</p>
<p>Let me share with you my plan for phase 2. Since I will be working on random matrices during the upcoming phase, I have started the design discussions with my mentors and things are taking shape. However, apart from random matrices, I will also work on a few more general improvements to my phase 1 work.</p>
<p>The outline is given below:</p>
<ol>
<li>
<p>I will work on extending the scope of queries handled by discrete Markov chains by covering some uncommon cases.</p>
</li>
<li>
<p>Some work will be done to extend the <code class="highlighter-rouge">DiscreteMarkovChain</code> by adding <code class="highlighter-rouge">ContinuousMarkovChain</code>, as the latter is almost the same but with some extra parameters.</p>
</li>
<li>
<p>I will implement random matrices and their various Gaussian ensembles according to the conclusion of the design discussions.</p>
</li>
<li>
<p>Last but not least, I will try to merge my open PRs.</p>
</li>
</ol>
<p>Thanks for reading and see you soon in phase 2. Bye!!</p>
Gagandeep Singh