[gsoc] Suggestions on Proposal for Boost Document Library Project

Anurag Ghosh

18 Mar 2015

1:59 p.m.

Hello everyone,

I'm Anurag Ghosh, a 2nd-year undergraduate student at IIIT-Hyderabad, India, and I'm interested in building the Boost Document Library as part of the Google Summer of Code 2015 program.

My proposal document can be viewed at https://github.com/anuragxel/boost-generic-document-library/wiki/Google-Summ...

Kindly provide comments on the project proposal, as I may have missed something. I'm hopeful that a discussion will help enrich my proposal.

I have also developed a working prototype for this project (given as a competency test; it currently covers only the OpenOffice/LibreOffice API). The code is hosted at https://github.com/anuragxel/boost-generic-document-library/

Kindly suggest any changes to the code you deem fit, and I will make them.

Thanks,
Anurag Ghosh (anuragxel)

Stefan Seefeld

18 Mar 2015

3:07 p.m.

Anurag, On 18/03/15 09:59 AM, Anurag Ghosh wrote:

Kindly provide me with comments on the project proposal, as it may seem I have missed out on something or the other. I'm hopeful that a discussion would be very helpful in enriching my proposal.

Having worked a fair bit with documents, and in particular with programmatic processing of documents, I believe I can relate to the appeal of a high-level API to facilitate the manipulation of such documents. However, I think this requires a bit more thought.

For one, I find your proposal hugely ambitious. In other words, I have doubts that you can achieve all the things you propose in such a short period.

Second, I don't think an interface to existing office suites is the right approach to the problem. Rather, I would suggest something based on existing standard technologies such as XML (and DocBook in particular) to support the manipulation of structured documents.

Note that last year we had a GSoC project to advance the state of a (proposed) Boost.XML library (which I mentored). I believe it's straightforward to build higher-level APIs on top of that to manipulate documents on a more "semantic" level, and then leave it to the various office suites to handle the import and export of the chosen format (Libre- and Open-Office already support DocBook). See https://github.com/stefanseefeld/boost.xml.

Regards,
Stefan

--

...ich hab' noch einen Koffer in Berlin...

Antony Polukhin

4:18 p.m.

2015-03-18 19:07 GMT+04:00 Stefan Seefeld <stefan@seefeld.name>: <...>

Second, I don't think an interface to existing office suites is the right approach to the problem. Rather, I would suggest something based on existing standard technologies such as XML (and DocBook in particular), to support the manipulation of structured documents.

Without using an existing office suite API, the student will be forced to rewrite the functionality of OpenOffice from scratch. Working with spreadsheets is not just parsing a document, but also evaluating functions, plotting charts, and so on. This is hell and a nightmare, and it will take an insane amount of time (about 2,190 years of effort: <https://www.openhub.net/p/libreoffice/estimated_cost>).

So the approach of unifying the APIs seems right to me.

--

Best regards,
Antony Polukhin

Stefan Seefeld

4:32 p.m.

On 18/03/15 12:18 PM, Antony Polukhin wrote:

Without using an existing office suite API, the student will be forced to rewrite the functionality of OpenOffice from scratch. Working with spreadsheets is not just parsing a document, but also evaluating functions, plotting charts, and so on. This is hell and a nightmare, and it will take an insane amount of time (about 2,190 years of effort: <https://www.openhub.net/p/libreoffice/estimated_cost>).

So the approach with unification of APIs seems right to me.

Well, yes, that's what I meant by "manipulating documents on a semantic level". I agree, writing this from scratch is wrong. But just providing a programmatic interface to an office suite seems ill-designed to me, at least for a project other than LibreOffice itself.

For Boost, I think one should at least attempt to build the functionality on top of a standard document model such as XML/DocBook.

Stefan

--

...ich hab' noch einen Koffer in Berlin...

Anurag Ghosh

8:03 p.m.

Sir,

Could you point to the specific parts that you think make the project hugely ambitious? Is it that I'm proposing too many functionalities (i.e. the scope is too broad), or that covering the different APIs across different platforms is too difficult to achieve?

Thanks,
Anurag Ghosh

Stefan Seefeld

8:32 p.m.

On 18/03/15 04:03 PM, Anurag Ghosh wrote:

Sir

Could you point to the specific parts that you think make the project hugely ambitious? Is it that I'm proposing too many functionalities (i.e. the scope is too broad), or that covering the different APIs across different platforms is too difficult to achieve?

Both. From a project-planning perspective, I would suggest starting with a single API to bind to and covering small use cases one at a time (creating or modifying small text documents, creating or editing small spreadsheets, etc.), rather than a big, generic "expose OOo functionality as a C++ API". Once that works, you may consider adding support for other backends (other word processors?).

But again, I'm not even convinced that this is in scope for Boost.org. As I said, I would rather see such functionality covered in application-agnostic terms, based on a structured document model. While still not necessarily in scope for Boost.org, at least it's more driven by Open Architecture considerations.

Stefan

--

...ich hab' noch einen Koffer in Berlin...
